CHAPTER I
INTRODUCTION 

Many  of  the  noises  to  which  man  is  exposed  serve  also  to  convey 
information  about  the  environment  in  which  he  lives.  The  noises  serve  as 
a masking  stimulus  which  decreases  his  ability  to  use  or  enjoy  desirable 
signals such as speech signals or music. In addition, however, the ever-present
noise stimuli can convey useful information and under proper conditions
such  noises  allow  the  listener  to  identify  the  source  and  to  make  inferences 
about  it. 

The  role  of  noise  waveforms  is  of  considerable  significance  in  a number 
of  situations  ranging  from  the  need  of  a submarine  to  detect  and  identify 
hostile  or  otherwise  potentially  hazardous  'targets'  to  that  of  a lathe 
operator's  ability  to  adjust  the  cutting  rate  of  his  machine.  The  ability 
to  identify  sources  of  ocean-borne  noise  is  of  great  concern  to  the  Navy  and, 
in  spite  of  ever  more  sophisticated  special  purpose  processing  machines, 
the  role  of  the  sonar  operator  in  this  context  is  still  significant.  Recently 
Urick and Gaunaurd (1972) and Stallard and Leslie (1974) have addressed the
question of psychophysics in sonar detection, and a considerable body of
experimental work has dealt with the related question of parameters important
in speech recognition (Stevens and House, 1972).

The  objective  of  this  report  is  to  review  relevant  knowledge  in  the 
context  of  passive  sonar  aural  recognition  of  a noise  source,  and  to  define 
an  experimental  approach  to  extend  the  state  of  knowledge  in  areas  of 
immediate concern to the sponsor, the Naval Air Systems Command (NAVAIR).

The exposition which follows will touch on a number of areas, including
auditory detection, learning, response uncertainty, and subject selection and
instruction.  The  results  of  pilot  studies  using  trained  college  age  listeners 
are  also  presented. 

The report is prepared in two parts, with Part 1 summarizing the sonar
classification task and then leading into the relationships of this task to
the general topic of acoustic warfare. Part 2 considers the classification
task in more general terms. The results obtained by a number of experimenters
in psychology are discussed in their relationship to the present effort.

Also, various statistical techniques and controls exercised during
experiments done to date are presented.

CHAPTER II

THE  AUDITORY  RECOGNITION  TASK 


2.1  General 

In  this  chapter  the  problem  posed  by  the  need  of  the  listener  to 
identify  or  recognize  an  auditory  source  is  investigated.  Such  recognition 
first  implies  that  the  listener  has  somehow  been  convinced  of  the  presence 
of  the  source  and  must  now  decide  to  which  of  a number  of  possible  classes 
of  sounds  the  source  belongs.  Indeed  in  practice  the  processes  of  detection 
and  classification  are  closely  related  as  will  be  seen  in  what  follows. 

Also,  in  most  cases  of  practical  interest  the  source  will  be  heard  in  the 
presence  of  some  extraneous  noises  which  will  tend  to  mask  the  presence  of 
the  source  as  well  as  confound  the  recognition  or  classification  task. 


To  cast  the  problem  in  more  concrete  terms,  consider  the  case  of  a 
sonar  operator  using  aural  means  only  and  facing  the  prospect  of  receiving 
the  acoustic  emanations  of  only  a single  source,  a target.  We  may  define 
the  task  precisely  using  a parallel  to  the  general  estimation  model  (Van 
Trees,  1968)  modified  to  include  some  concepts  from  pattern  recognition 
(Fu,  1968.) 

[a] A source, the target, may for our purposes be classified by
a set of acoustic parameters. These parameters a define a
multi-dimensional parameter space V, and they may be associated
with such features as the absence or presence of certain
bands of noise, the modulation characteristics, or the time
dependence of the sound. We will denote the pattern
corresponding to source j as $\omega_j$, together with its
associated features; see Figure 1.1.


[b] The source parameters are mapped into an observation space
according to some probabilistic law. The observable R
is in our case the listener's response to the stimulus. The
observation, as perceived by the listener, includes the
effects of the auditory transducer as well as the subsequent
neural and higher level cognitive processes.

Figure 1.1 General Classification Model

[c] It is suggested that the listener perceives the set of
observables as features and organizes these into patterns.
An N-dimensional observation space can then be reduced
to an l-dimensional feature space. As an example of this
process, we can consider the critical band concept, which
successfully explains many auditory phenomena (Plomp and
Smoorenburg, 1970, Zwicker, Flottorp, and Stevens, 1957).

[d]  It  is  further  proposed  that  the  listener  will  try  to  compare 
the  pattern  he  hears  to  one  of  a number  of  patterns  previously 
learned  in  order  to  arrive  at  an  identification  of  the  source 
$\omega(R)$.


In any realistic listener classification task, the patterns $\omega_j$, the
features, and hence the parameters a occur with some a priori probability. The
likelihood of a specific pattern strongly influences the classification
problem by restricting the number of patterns which the listener expects to
hear.


In  the  above  model  of  the  recognition  task,  the  listener  is  seen  to 
perform  the  functions  of  feature  extraction  and  pattern  classification.  The 
probabilistic  mapping  from  the  parameter  to  the  observation  space  includes 
such  perturbations  as  are  introduced  by  the  presence  of  noise  either  between 
the  source  and  the  listener's  auditory  end-organ  or  subsequently  in  auditory 
pathways.

2.2  Patterns  with  Dichotomous  Features 

As  a foundation  from  which  to  proceed  to  consider  realistic  recognition 
tasks  and  which  may  allow  for  some  psychoacoustic  tests  of  the  concepts,  we 
now  consider  the  case  of  dichotomous  features.  Specifically,  we  wish  to 
treat  the  case  where  source  acoustic  patterns  differ  by  either  having  or  not 
having  specific  features. 

To further simplify, consider the recognition task where the listener
must decide between two signal patterns only, and they differ by the
presence or absence of a single feature. The listener then is faced
with the following choices:

a) If the perceived pattern is thought to have the feature, say that
it is pattern $\omega_1$.

b) If the feature is thought to be missing, say that the pattern
heard is pattern $\omega_0$.

In this case the listener is then faced with the task of deciding
between two hypotheses:

$H_0$: Feature is absent, decide pattern $\omega_0$

$H_1$: Feature is present, decide pattern $\omega_1$

on the basis of a sensory input.

If this feature were the only feature of the signal, the problem would
reduce to that of detection of a signal. While the detection problem is
certainly of interest, in the present work we are rather concerned with
distinguishing between pattern classes. The detection problem has been
treated elsewhere (Andrews and Hovater, 1971, Urick and Gaunaurd, 1972),
but it is instructive in the present case to state some of the applicable
concepts.

It is reasonable to suspect that the decision between $H_0$ and $H_1$ requires
at the very least an opportunity for the listener to detect the feature when
hypothesis $H_1$ applies. Whether this detection opportunity is in fact sufficient
for making the decision is as yet an open question. In attempting to analyze
the  problem  in  more  detail,  the  theory  of  signal  detectability  will  be 
reviewed.  Also,  it  will  be  seen  later  that  in  some  experimental  cases  the 
detection  of  dichotomous  features  seems  to  be  an  adequate  basis  for  a 
classification  decision. 

2.3  The  Theory  of  Signal  Detectability 

The theory of signal detectability is a mathematical model of the mapping
of a set of stimuli, consisting either of noise alone or of signal plus
noise, into a two-point decision space: "no" the signal was not contained
in the input, or "yes" the signal was contained in the input (Tanner and
Sorkin, 1972). While these models are not intended to be descriptions of the
way a human observer makes this binary decision, they are normative models
with which the observer's performance can be compared.

The models comprising this theory are applicable to any receiver
operating on a set of inputs and, as such, do not take into account the
properties of the human per se. A number of workers have successfully
demonstrated the utility of the theory of signal detectability for visual
as well as auditory stimuli (Green and Swets, 1966, Peterson, Birdsall
and Fox, 1954, Swets, 1964). It is generally found that human
performance falls short of that predicted for an ideal detector.

The  radiated  noise  of  a marine  target  consists  of  a broad  band  noise 
spectrum  with  perhaps  some  significant  tonals  associated  with  machinery  on 
board.  The  problem  of  detecting  the  presence  of  various  broadband  components 
reduces to a test of hypotheses about signals which are only known statistically.
That is, the exact phases of the components, their instantaneous amplitudes,
bandwidths, etc. are at best known in terms of their averaged power spectra.

The theory of signal detectability for broadband features leads to a
likelihood ratio test for signals known statistically. If we further assume that
the radiated noise can be modeled as a collection of samples with identical
Gaussian probability densities, the ideal processor is an energy detector
(Van Trees, 1968).


From the sampling theorem in the time domain, a band-limited signal with
bandwidth W and duration T can be exactly determined by 2WT samples. These
samples will be statistically independent if the process is Gaussian
(Lathi, 1968). The maximum log likelihood ratio test is then given by


$$X(R) = \sum_{i=1}^{2WT} R_i^2 \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma$$

where the threshold $\gamma$ includes all sample invariant factors as well as the
criteria (Van Trees, 1968). We can see that the likelihood ratio test consists
of computing the sum of squares of the statistically independent sample data
points and comparing this to an appropriate threshold. Since the $R_i$ are
Gaussian random variables, the sufficient statistic X(R) is a random variable
with a gamma distribution

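$$p_{X|H_h}(x|H_h) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \qquad 0 < x < \infty. \tag{2.1}$$

In this case

$$\alpha = \nu/2 = WT, \qquad \beta = 2\sigma_h^2 \quad \text{(Hogg, 1970).}$$

The parameter $\sigma_h^2$ is given by:

$$\sigma_h^2 = \sigma_n^2 \quad \text{under } H_0, \text{ if } R \text{ is } n(0, \sigma_n^2)$$

$$\sigma_h^2 = \sigma_n^2 + \sigma_s^2 \quad \text{under } H_1, \text{ where } R \text{ is } n(0, \sigma_n^2 + \sigma_s^2)$$

The notation $n(\mu, \sigma^2)$ is used to denote a random variable with Gaussian
probability density which has a mean $\mu$ and a variance $\sigma^2$. Equation (2.1)
reduces to that of a chi-square density function with $\nu$ degrees of freedom
whenever $\sigma_h^2 = 1$.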

Furthermore, whenever the number of degrees of freedom

$$\nu = 2WT > 100,$$

the random variable X can be closely approximated by a Gaussian density
with the following parameters under $H_0$ and $H_1$ (Abramowitz and Stegun, 1964):

$$p_{X|H_0}(x|H_0) \text{ is } n(2WT\,\sigma_n^2,\; 4WT\,\sigma_n^4)$$

$$p_{X|H_1}(x|H_1) \text{ is } n(2WT[\sigma_n^2 + \sigma_s^2],\; 4WT[\sigma_n^2 + \sigma_s^2]^2).$$
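The moments quoted above follow from treating X as a scaled chi-square variable with 2WT degrees of freedom. A minimal Python sketch of the energy statistic and of these moments is given here as an illustration only; the function names and the numerical values of W, T, and the variances are assumptions made for the example and do not come from the report.

```python
import math
import random


def energy_statistic(samples):
    """Sum of squares of the samples, i.e. the statistic X(R)."""
    return sum(r * r for r in samples)


def statistic_moments(w_hz, t_sec, var_noise, var_signal=0.0):
    """Mean and variance of X under the Gaussian-sample model.

    With n = 2WT independent n(0, var) samples, X is var times a
    chi-square variable with n degrees of freedom, so
    E[X] = n * var and Var[X] = 2 * n * var**2.
    Use var_signal = 0 for H0 and var_signal > 0 for H1.
    """
    n = 2.0 * w_hz * t_sec
    var = var_noise + var_signal
    return n * var, 2.0 * n * var * var


if __name__ == "__main__":
    # Illustrative values only (hypothetical band and observation time).
    rng = random.Random(0)
    w_hz, t_sec, var_noise = 100.0, 1.0, 1.0
    n = int(2 * w_hz * t_sec)
    trials = [
        energy_statistic([rng.gauss(0.0, math.sqrt(var_noise)) for _ in range(n)])
        for _ in range(500)
    ]
    sample_mean = sum(trials) / len(trials)
    print("simulated mean of X:", sample_mean)
    print("predicted (mean, variance):", statistic_moments(w_hz, t_sec, var_noise))
```

Running the script compares a simulated mean of X under noise alone with the predicted value 2WT times the noise variance.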


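In keeping with common usage (Swets, 1964), define a detectability
index as:

$$d'_{opt} = \frac{E(X|H_1) - E(X|H_0)}{\left\{\tfrac{1}{2}\left[\mathrm{Var}(X|H_1) + \mathrm{Var}(X|H_0)\right]\right\}^{1/2}} = \frac{(WT)^{1/2}\,\sigma_s^2}{\sigma_n^2\left[\tfrac{1}{2}(\sigma_s^2/\sigma_n^2)^2 + (\sigma_s^2/\sigma_n^2) + 1\right]^{1/2}} \tag{2.2}$$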


For the small signal-to-noise case where

$$\sigma_s^2/\sigma_n^2 \ll 1,$$

Equation 2.2 reduces to the usual case where the variance under the two
hypotheses is the same. Then

$$d'_{opt} \approx (WT)^{1/2}\,\sigma_s^2/\sigma_n^2, \qquad \sigma_s^2/\sigma_n^2 \ll 1,$$

for a forced choice test.
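To see how quickly the small-signal form departs from the full expression, both can be evaluated side by side. The sketch below assumes the detectability index has the form $(WT)^{1/2} s / [s^2/2 + s + 1]^{1/2}$ with $s = \sigma_s^2/\sigma_n^2$, which is what the Gaussian moments of X imply; the function names are hypothetical.

```python
import math


def d_prime_opt(w_hz, t_sec, snr):
    """Detectability index of the ideal energy detector, Equation (2.2).

    snr is the in-band power ratio sigma_s^2 / sigma_n^2 (not in dB).
    """
    wt = w_hz * t_sec
    return math.sqrt(wt) * snr / math.sqrt(0.5 * snr ** 2 + snr + 1.0)


def d_prime_small_signal(w_hz, t_sec, snr):
    """Small-signal limit of Equation (2.2), intended for snr << 1."""
    return math.sqrt(w_hz * t_sec) * snr


if __name__ == "__main__":
    # Illustrative band and duration only.
    for snr in (0.01, 0.1, 1.0):
        print(snr, d_prime_opt(100.0, 1.0, snr), d_prime_small_signal(100.0, 1.0, snr))
```

The approximation always overestimates the full expression, and the gap grows as the signal-to-noise ratio approaches unity.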

2.4  Deciding  Between  Complex  Patterns 

In the preceding sections we assumed that a minimum requirement for
distinguishing between two classes of patterns was the opportunity to detect
the distinguishing features. We then went on to define a measure of
detectability, the detectability index $d'_{opt}$ for bands of noise. An
equivalent measure of detectability can in theory be written for all features
which  characterize  the  patterns.  Actual  marine  noise  sources  are  much  more 
complex  than  this,  however,  and  one  wonders  about  the  utility  of  these 
simple  models.  In  an  attempt  to  relate  the  problems,  below  is  an  ordering 
of  auditory  recognition  tasks  from  simple  to  operationally  realistic. 

a)  Distinguishing  between  two  patterns  differing  by  one 
dichotomous  feature. 

b)  Distinguishing  between  two  patterns  differing  by  a number 
of  dichotomous  features. 

c)  Distinguishing  between  two  pattern  classes  which  differ  by 
one  feature  which  can  exhibit  a continuous  range  of 
detectability  or  feature  parameters. 

d)  Distinguishing  between  two  pattern  classes  which  differ 
by  a number  of  features  with  ranges  of  detectabilities. 

e) Deciding to which of N pattern classes a sound belongs when
the  sound  has  a number  of  features  with  ranges  of 
detectabilities  or  parameters  for  each. 

We  can  see  that  the  problem  with  which  the  sonar  operator  is  faced  is 
either  d or  e above.  Whenever  the  operator  is  somehow  convinced  that  the  a 
priori  probability  of  targets  is  such  that  only  two  classes  of  patterns  are 
to  be  expected,  the  problem  is  simplified  but  still  is  a very  complex  one. 
Also, nothing has been said so far about his degree of knowledge (learning)
about  the  patterns  with  which  he  is  attempting  to  match  the  unknown. 

Whenever a number of features are associated with a pattern, it can
be expected that some features will occur only when others are present.
The features are therefore correlated with one another to some extent. As
an example of this idea, consider the pattern classes consisting of:

a low speed surface ship

vs. a surface ship proceeding at a speed high enough to cause
prop cavitation.

The low speed pattern class will exhibit a number of tonals which are
associated with machinery on board. Also, the sound will be only lightly
amplitude modulated by the effect of waves on the acoustic coupling to the
water. As the ship increases speed, some tonals will shift in frequency while
others remain stable. Also, the broadband noise will increase and will tend to
mask the tonals. Once prop cavitation occurs, there is a speed range over
which the broadband sound will be strongly amplitude modulated. Other
changes also occur in the sound depending on the type of ship, the number of
screws, etc. (Myasnikov and Myasnikova, 1971). How do we attack this complex
problem in an efficient manner?

One  approach  is  to  present  members  of  the  two  pattern  classes  to 
listeners  under  various  listening  conditions  and  to  elicit  classification 
responses.  From  such  experimental  results  we  can  obtain  a measure  of 
operator  performance  for  a specific  set  of  pattern  classes  under  the  conditions 
tested.  This  method  has,  in  fact,  been  applied  to  a number  of  sonar  cases 
as well as to speech perception (National Research Council, 1949, Licklider, 1951,
Pollack,  1948.)  Based  on  a large  number  of  such  experiments,  we  can  perhaps 
extend  the  observed  results  to  classification  of  as  yet  untested  sources. 

Alternately,  we  can  treat  simple  cases  with  an  attempt  to  model  the 
process.  Hopefully  we  can  then  infer  the  performance  in  the  more  complex 
cases.  Certainly  some  experimental  verification  with  these  complex  sounds 
would be needed if the model is to be useful. Listed below are some of the
techniques which can be considered in the extension of simple concepts to
operationally significant ones.

a) Inter-relationships between features which are correlated can
be reduced to second-moment characterizations in terms of
uncorrelated random variables using the Karhunen-Loeve expansion
(Van Trees, 1968, Fu, 1968).

b) These orthogonal feature eigenvectors can then be treated using
detection theory in the case of dichotomous features or
estimation theory when features can occur over a range of
detectabilities or parameter values.

c) Classification into one of N pattern classes can then be modeled
using the concept of discriminant functions and other pattern
recognition procedures (Fu, 1968, Sholl, 1971).

CHAPTER III

SEQUENTIAL  RECOGNITION 


3.1  Role  of  Criteria  and  A Priori  Knowledge  in  Sequential  Detection 

In keeping with the assumptions made in the foregoing chapter, we will
now go on to analyze in greater detail the detection of a dichotomous
feature. Operationally the sonar operator is faced first with the
problem of detecting a target and then with the classification of that
target. This is a sequential detection problem in both the initial
detection case and the detection of the feature.

At any time $t_j$, the listener is faced with three possible choices:

a) On the basis of the aural stimulus, decide $H_0$, the feature
is absent.

b) Given the aural stimulus, decide $H_1$, the feature is present.

c) Decide to continue listening because there is not sufficient
evidence to decide either $H_0$ or $H_1$.

The listener will choose his course of action depending both on the
information in the input stimulus and on his mental set (Leeper, 1951).


To gain insight into the effect of a priori knowledge and response criterion,
consider the two cases below. In both cases the two pattern classes to
be decided between are $\omega_1$, the pattern associated with a light warship,
and $\omega_2$, the acoustic pattern of a tanker.

a)  A modern  tanker  can  have  a draft  exceeding  100  feet  and  the 
submarine  is  required  to  transit  at  rather  shallow  depth 

in  proximity  to  a heavily  traveled  shipping  lane. 

b) In times of heightened military tension, the submarine is
to  maintain  covert  surveillance  of  all  military  surface 
ships  in  a strategic  sector. 

In  the  first  case,  the  penalty  for  failing  to  classify  the  tanker  could 
be  very  great,  whereas  misclassifying  the  warship  as  a tanker  would  only  lead 
to a minor penalty due to having to skirt a potentially harmless shallow
draft  craft.  Also,  the  operator  would  be  anticipating  merchant  shipping. 
In  the  second  case,  the  situation  is  less  clear  cut  since  misclassifying 
a tanker  as  a warship  would  dilute  the  efforts  of  the  submarine.  Falsely 
dismissing  the  warship  would  also  be  a serious  error,  however. 

Whenever  the  a priori  probabilities  of  the  signals  are  known,  and  a 
cost  can  be  associated  with  the  available  courses  of  action,  the  Bayes 
criterion  can  be  used  to  minimize  the  total  risk  (Van  Trees,  1968.)  In 
the  sequential  detection  problem  we  then  have: 

$P_1$, $P_2$, the a priori probabilities of $\omega_1$, $\omega_2$ respectively.

$C_{ij}$, $i, j = 1, 2$, where $C_{ij}$ is the cost of classifying the pattern
$\omega_j$ as $\omega_i$.

$C_{det}$, the cost of deferring the decision until more information is
available.

Alternately, if the a priori probabilities are not known, or if it is
not reasonable to assign costs to each of the courses of action, it may be
more appropriate to set a criterion which holds the probability of, say,
a false classification below some limit. The threshold adopted
in this case will be chosen using the Neyman-Pearson criterion.
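The two ways of fixing a threshold can be contrasted numerically. The sketch below is an illustration under stated assumptions, not the procedure used in this report: it treats the simpler fixed-sample binary test (no deferral cost), indexes the hypotheses 0 and 1, takes the cost `c_ij` as the cost of deciding hypothesis i when j is true, and uses the Gaussian approximation to the energy statistic for the Neyman-Pearson case. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def bayes_threshold(p0, p1, c00, c01, c10, c11):
    """Likelihood-ratio threshold that minimizes the Bayes risk.

    p0, p1 are the a priori probabilities of H0 and H1; c_ij is the cost
    of deciding H_i when H_j is true. Decide H1 when the likelihood
    ratio exceeds the returned value.
    """
    return (p0 * (c10 - c00)) / (p1 * (c01 - c11))


def neyman_pearson_threshold(w_hz, t_sec, var_noise, p_fa):
    """Threshold on the energy statistic X for a fixed false-alarm rate.

    Assumes X under H0 is approximately Gaussian with mean 2WT*var_noise
    and variance 4WT*var_noise**2 (reasonable when 2WT is large).
    """
    n = 2.0 * w_hz * t_sec
    mean0 = n * var_noise
    sd0 = math.sqrt(2.0 * n) * var_noise
    return NormalDist(mean0, sd0).inv_cdf(1.0 - p_fa)


if __name__ == "__main__":
    # A costly miss (c01) pushes the Bayes threshold down; values are illustrative.
    print(bayes_threshold(0.5, 0.5, 0.0, 10.0, 1.0, 0.0))
    print(neyman_pearson_threshold(100.0, 1.0, 1.0, 0.05))
```

In the first example the large miss cost lowers the threshold, so the listener would report the feature on weaker evidence; the Neyman-Pearson threshold, by contrast, depends only on the noise-alone distribution and the tolerated false-alarm rate.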

Whatever  the  criterion  by  which  the  thresholds  for  the  available  courses 
of  action  are  determined,  the  observed  performance  will  be  strongly  influenced 
by  this  threshold.  A way  to  characterize  this  performance  under  various 
criteria  is  by  means  of  a receiver  operating  characteristic  (ROC)  curve. 

This  curve  presents  the  probability  of  a detection  P(D)  vs  the  probability 
of  a false  alarm  P(FA).  As  the  listener  adjusts  his  criteria,  both  of  these 
probabilities  will  change. 

Under a lax criterion, i.e., responding "present" whenever there is any
evidence of the feature, the number of correct detections will be high, as
will the probability of a false alarm. A more strict criterion will decrease
both of these probabilities.
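The trade-off between P(D) and P(FA) can be traced out for a simple case. The sketch below assumes the equal-variance Gaussian model, in which the decision statistic has unit variance under both hypotheses and the means differ by d'; this is an assumption made for illustration, and the function names are hypothetical.

```python
from statistics import NormalDist


def roc_point(d_prime, criterion):
    """Return (P(FA), P(D)) for one criterion setting.

    The criterion is expressed in standard deviations above the
    noise-alone (H0) mean; the H1 mean lies d_prime above the H0 mean.
    """
    std = NormalDist()
    p_fa = 1.0 - std.cdf(criterion)
    p_d = 1.0 - std.cdf(criterion - d_prime)
    return p_fa, p_d


def roc_curve(d_prime, criteria):
    """ROC points for a list of criteria, ordered as given."""
    return [roc_point(d_prime, c) for c in criteria]


if __name__ == "__main__":
    # Sweep from a lax criterion (-2) to a strict one (+3) for d' = 1.
    for p_fa, p_d in roc_curve(1.0, [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]):
        print(round(p_fa, 3), round(p_d, 3))
```

Moving down the printed list corresponds to the listener becoming stricter: both probabilities fall together, which is the behavior described above.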


3.2 Factors Influencing the Terminal Decision

Whenever a decision to accept hypothesis $H_0$ or $H_1$ is made, we refer
to that as a terminal decision. How is such a terminal decision made?

The initial target detection has been treated elsewhere, and will not be
reiterated (Loeb, 1970, Boehme, 1970). But, knowing that a source is
present, the listener must now decide about the presence of a specific feature.

If we assume for a moment that in each detection opportunity the
decisions of $H_0$ and $H_1$ both have nonzero probabilities, it is only a
question of time until a terminal decision will occur by chance alone.


Wald (1947) extended the concept of the likelihood ratio test to the
sequential detection problem. The sequential probability ratio test is,
at the i'th stage,

$$\Lambda_i(R) = \prod_{j=1}^{i} \frac{p_{R|H_1}(R_j|H_1)}{p_{R|H_0}(R_j|H_0)}$$

decide $H_1$ if $\Lambda_i(R) > B$,
decide $H_0$ if $\Lambda_i(R) < A$,
else defer the decision,

with $A < B$. Here A and B are the thresholds for deciding $H_0$ and $H_1$
respectively. Wald goes on to prove that such a test will always terminate,
and he derives the expected number of observations for a terminal decision.
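A sequential test of this kind is easy to sketch for the band-of-noise case. The Python example below is a sketch under stated assumptions rather than a model of the listener: samples are taken to be independent zero-mean Gaussian values with a fixed variance under each hypothesis, and the thresholds are set with Wald's well-known approximations $A \approx \beta/(1-\alpha)$ and $B \approx (1-\beta)/\alpha$. The function names are hypothetical.

```python
import math


def wald_thresholds(alpha, beta):
    """Wald's approximate thresholds (A, B), with A < B.

    alpha is the tolerated false-alarm probability and beta the
    tolerated miss probability.
    """
    return beta / (1.0 - alpha), (1.0 - beta) / alpha


def sprt_variance(samples, var0, var1, alpha=0.05, beta=0.05):
    """Sequential test of n(0, var0) under H0 against n(0, var1) under H1.

    Returns (decision, samples_used), where decision is "H0", "H1", or
    "defer" if the samples run out before either threshold is crossed.
    """
    a, b = wald_thresholds(alpha, beta)
    log_a, log_b = math.log(a), math.log(b)
    log_lr = 0.0
    used = 0
    for r in samples:
        used += 1
        # Log of p(r | H1) / p(r | H0) for zero-mean Gaussian densities.
        log_lr += 0.5 * math.log(var0 / var1)
        log_lr += 0.5 * r * r * (1.0 / var0 - 1.0 / var1)
        if log_lr >= log_b:
            return "H1", used
        if log_lr <= log_a:
            return "H0", used
    return "defer", used


if __name__ == "__main__":
    print(sprt_variance([0.0] * 20, 1.0, 2.0))   # quiet samples favor H0
    print(sprt_variance([3.0] * 20, 1.0, 2.0))   # large samples favor H1
```

The number of samples consumed before a terminal decision depends strongly on how far the observed energy lies from the region of overlap between the two hypotheses.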

In  the  passive  sonar  case  we  must  be  very  specific,  however,  in  the  way 
we  define  a detection  opportunity  or  observation.  Furthermore,  the  above 
development assumes that the probability densities under $H_0$ and $H_1$ remain
unchanged  from  observation  to  observation.  The  actual  detection  occurs  in 
a situation  where  the  source  to  receiver  range,  hence  signal-to-noise  ratio 
and  probability  densities,  change  as  a function  of  time.  The  criteria  and 
associated  threshold  are  also  likely  to  change  in  the  real-world  situation. 

If,  for  example,  the  time  required  to  come  to  a decision  is  very  long,  the 
criterion  may  be  relaxed  for  the  sake  of  arriving  at  a decision  so  that  another 
target can be investigated.

3.3  Forced  Response  vs.  Sequential  Classification 

Classifications which imply only the detection of one dichotomous feature
may be conveniently visualized by the probability densities under $H_0$ and
$H_1$. Whenever the listener must make either a yes-no (YN) or a
two-alternative forced-choice (2AFC) response, a single threshold divides the
decision space into two distinct regions. Figure 2.1a shows the probability
densities and error regions for this situation for the detection of a band of
noise in noise. The error regions are:

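$$\alpha = P(\text{deciding } H_1 | H_0) = P(FA) = \int_{\gamma}^{\infty} p_{X|H_0}(x|H_0)\,dx$$

$$\beta = P(\text{deciding } H_0 | H_1) = 1 - P(D) = \int_{-\infty}^{\gamma} p_{X|H_1}(x|H_1)\,dx.$$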

The error regions at time t for sequential detection are shown in
Figure 2.1b. The probability of an error in classification will be given by:

$$P(E)_t = P(H_0)\,P(\text{deciding } H_1 | H_0)_t + P(H_1)\,P(\text{deciding } H_0 | H_1)_t = \alpha_t\,P(H_0) + \beta_t\,P(H_1)$$

where

$$\alpha_t = \int_{\gamma_{1,t}}^{\infty} p_{X|H_0}(x|H_0)\,dx, \qquad \beta_t = \int_{-\infty}^{\gamma_{0,t}} p_{X|H_1}(x|H_1)\,dx,$$

and $\gamma_{0,t}$ and $\gamma_{1,t}$ are the thresholds at time t for deciding $H_0$ and
$H_1$ respectively.

In practice, the criterion, hence threshold, will be set by the listener
in response to the situation. The numeric value of the threshold $\gamma$ will not
be known, but in the forced response case it can be inferred from the observed
values of P(FA) and P(D). The sequential detection and classification case
presents some difficulties, however. Especially when the distributions under
$H_0$ and $H_1$ are changing, knowledge of $\alpha_t$ and $\beta_t$ is itself not
adequate to define $\gamma_{0,t}$ and $\gamma_{1,t}$ without assumptions about the
terminal decision process.
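For the forced response case, the inference of the threshold from observed rates can be illustrated as follows. The sketch assumes the equal-variance Gaussian model with unit variance under both hypotheses, which is an assumption of the example rather than a result of this report; the function name is hypothetical.

```python
from statistics import NormalDist


def infer_from_rates(p_fa, p_d):
    """Infer (d_prime, criterion) from observed P(FA) and P(D).

    The criterion is returned in standard deviations above the
    noise-alone mean. Both rates must lie strictly between 0 and 1,
    otherwise the inverse normal transform is undefined.
    """
    z = NormalDist().inv_cdf
    criterion = -z(p_fa)
    d_prime = z(p_d) - z(p_fa)
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical observed rates from a yes-no session.
    print(infer_from_rates(0.10, 0.80))
```

No comparable shortcut exists in the sequential case, since the two thresholds and the changing densities all enter the observed error rates.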

The points made in this chapter may appear of only theoretic interest.
However, we will see that these are central issues in the design of
psychoacoustic experiments of aural classification.

Figure 2.1 Error Regions for Forced Response and
Sequential Detection

CHAPTER IV

DESIGN OF AN EXPERIMENT IN AURAL RECOGNITION


Classically, the basic correlates of the auditory stimulus have been
investigated by threshold methods (Boring, 1950). In this approach, the
stimulus parameter to be measured is changed slowly until a just perceptible
difference in the physiological response occurs (Licklider, 1951). Such
techniques, with variations in details of implementation, have been used to
investigate the equal loudness of various spectrally shaped noises, for
example. The response was one of "louder than" or "not as loud as" a
reference band of noise (Cremer, Plenge, and Schwarzl, 1959, Zwicker, 1958,
Zwislocki, 1969).


The results of these threshold experiments can be explained using the
theory of signal detectability. It can be inferred that the listener operates
in a sequential detection or recognition environment. Whenever the established
criterion is exceeded during a period of increasing stimulus intensity, the
threshold can be thought to have been exceeded. Also, many of these experiments
alternately increase and decrease the stimulus intensity with the aim of
"bracketing" the threshold value.


It should be pointed out, however, that single or multiple thresholds
for sensory inputs have been postulated as explaining the observed results.
Also, quantization of the auditory process has been suggested (Licklider, 1951).
While we are not concerned here with the analysis of detailed mechanisms
which are active in the human organism, there are basic differences in approach
between experimenters using threshold techniques and those testing various
aspects of the signal detection theory. These differences in technique are of
considerable importance.


Detection theory related experiments are designed to elicit data about
either the ROC curve or the psychometric function of the listener. The
psychometric function is the variation of percent correct detection or
classification as a function of stimulus level. The key parameters are usually
the various probabilities of the available courses of action and the
detectability expected on the basis of some signal detection model. The
detectability is usually defined as the normalized difference of means of a
statistic under two alternate hypotheses.

The signal detection theory types of experiments present the stimuli
to a listener at some signal-to-noise ratio and observe the performance.
The criterion is usually specifically included by way of instructions prior
to the test. Data are taken at a number of signal-to-noise ratios or
other fixed values of a parameter (Swets, 1964). The responses are either
"yes" the signal was present in a trial period, or "no" it was not, in a
YN type experiment, or, in an N-alternative forced-choice (NAFC) experiment,
the listener indicates in which one of N trial periods the signal occurred.
Variations of these methods are also used, and the signals are usually
presented in the presence of a noise at some preselected level.

The latter experimental technique, the YN or NAFC tests, has the
advantage that the results are subject to well known and easily applied
statistical hypothesis testing techniques as are described in Chapter II.
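A predicted psychometric function can be sketched for the two-alternative forced-choice case. The example below makes several assumptions that are not taken from this report: the noise variance is normalized to one, the energy statistic is treated as Gaussian, and percent correct is computed from the equal-variance relation $P(C) = \Phi(d'/\sqrt{2})$, which is only approximate when the variances under the two hypotheses differ. Function names and the value of WT are hypothetical.

```python
import math
from statistics import NormalDist


def detectability(mean0, var0, mean1, var1):
    """Normalized difference of means of a statistic under H0 and H1."""
    return (mean1 - mean0) / math.sqrt(0.5 * (var0 + var1))


def percent_correct_2afc(d_prime):
    """Percent correct in a 2AFC task under the equal-variance Gaussian model."""
    return 100.0 * NormalDist().cdf(d_prime / math.sqrt(2.0))


def psychometric_function(snr_db_levels, wt):
    """Predicted (level in dB, percent correct) pairs for an energy detector.

    wt is the bandwidth-time product WT; the noise variance is taken as 1.
    """
    points = []
    n = 2.0 * wt
    for snr_db in snr_db_levels:
        snr = 10.0 ** (snr_db / 10.0)
        d = detectability(n, 2.0 * n, n * (1.0 + snr), 2.0 * n * (1.0 + snr) ** 2)
        points.append((snr_db, percent_correct_2afc(d)))
    return points


if __name__ == "__main__":
    for level, pc in psychometric_function([-20.0, -15.0, -10.0, -5.0, 0.0], 100.0):
        print(level, round(pc, 1))
```

The curve rises from chance (50 percent) toward 100 percent as the signal-to-noise ratio increases, which is the general shape against which listener data would be compared.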

4.2  A Pilot  Study  and  its  Impact 

Initially, a pilot study was set up using university students as
subjects to identify weaknesses in experimental techniques and to obtain
bounds on listener performance. This was an identification task in which
the listener was asked to decide which of two trained marine sounds (the
exposure set) was subsequently presented in a broadband noise background.

The signals were presented binaurally from tape recordings in an experimental
arrangement very similar to that which is described in later sections.

However, for these initial tests the unknown or probe stimulus was presented
for a 20 second period of time at a fixed signal-to-noise ratio. The listener
was asked to record on an answer sheet his or her assessment of which member
of the exposure set the probe corresponded to. In addition, the third
alternative "don't know" was to be indicated whenever "reasonable" doubt existed.
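Because a third "don't know" response is allowed, the answer sheets can be scored in more than one way. The sketch below shows one possible tally, offered only as an illustration; the response coding ("A", "B", and "?" for "don't know") and the function name are assumptions, not the scoring actually used in the pilot study.

```python
def score_session(truth, responses, dont_know="?"):
    """Tally one session of A / B / don't-know responses.

    truth and responses are equal-length sequences. Each truth entry is
    "A" or "B"; each response is "A", "B", or the dont_know marker.
    """
    if len(truth) != len(responses):
        raise ValueError("truth and responses must have the same length")
    n = len(truth)
    correct = sum(1 for t, r in zip(truth, responses) if r == t)
    unsure = sum(1 for r in responses if r == dont_know)
    committed = n - unsure
    return {
        "events": n,
        "correct": correct,
        "wrong": committed - correct,
        "dont_know": unsure,
        # Fraction correct counting "don't know" as not correct.
        "p_correct_all": correct / n if n else 0.0,
        # Fraction correct among events where the listener committed.
        "p_correct_committed": correct / committed if committed else 0.0,
    }


if __name__ == "__main__":
    print(score_session(list("AABB"), ["A", "B", "?", "B"]))
```

Reporting both fractions keeps the listener's willingness to commit separate from accuracy when a commitment is made.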

Because the sounds presented do not fall within the range of normal
auditory experience for these listeners, an immediate area of difficulty arises.
We are precluded from using techniques applicable to, for example, speech
intelligibility studies, where the subjects can be assumed to have a common
learned ability to distinguish between signals (Pollack, 1948). A
time-consuming stimulus training period is required prior to any probe
measurement. Also, since the stimulus characteristics cannot be reliably
verbalized, the arbitrary designations A and B were associated with the
trained stimuli.

Such  arbitrary  association  of  letters  with  the  members  of  the  exposure  set 
is  unfortunate.  However,  the  methods  used  by  Pollack  (1959)  to  circumvent 
this  shortcoming,  the  method  of  recognition  memory,  cannot  be  used  in  our 
case.  In  that  method,  the  probe  is  chosen  from  an  augmented  set  which  includes 
the  members  of  the  exposure  set  and  members  of  another  set,  the  confusion 
set.  The  listener  response  then  consists  of  an  assessment  of  whether  the 
probe  in  fact  corresponded  to  a member  of  the  exposure  set.  The  marine 
sounds  are  such  that  the  experimenter  cannot  know  if  the  confusion  signals 
will not, in fact, sound "like" one of the exposure signals. The listener
cannot establish a mental "measure" of the dissimilarity of signals in this
case.

A test  event  consists  of  an  exposure  (learning)  period  for  the  two 
signals  without  interfering  noise.  After  a short  pause,  the  probe  is  presented 
in the interfering noise. Subsequently, the exposure set is repeated in a
refresh period, or a new set of signals is exposed for learning. No feedback
was used except that in some cases the subjects were apprised of their
performance at the end of a test session. Each test event requires about two
minutes to complete, for a total of 15 to 18 events per session.

Little was known at the time of this pilot study about the range of
signal-to-noise ratios at which the listener performance would attain some
pre-determined probability of correct response P(C). It was necessary,
therefore, to present the probe at a number of values of signal-to-noise ratio.

Using  four  signal  pairs  of  interest  and  five  values  of  signal-to-noise  ratio, 
there  were  20  performance  indices  to  be  estimated.  As  will  be  shown  later,  one 
would  prefer  at  least  50  data  points  per  performance  index  for  a total  of 
1000  events  or  about  60  sessions. 
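
The session count quoted above follows from simple arithmetic, which can be sketched as below. This is only an illustrative calculation; the function name and argument names are hypothetical, and the inputs are the figures stated in the text (four signal pairs, five SNR values, 50 events per index, 15 to 18 events per session).

```python
import math


def sessions_needed(signal_pairs, snr_levels, events_per_index, events_per_session):
    """Return (performance indices, total events, sessions) for a fixed-SNR design."""
    indices = signal_pairs * snr_levels
    total_events = indices * events_per_index
    # A partially filled session still has to be run, so round up.
    sessions = math.ceil(total_events / events_per_session)
    return indices, total_events, sessions


# Figures taken from the text; 15 and 18 bracket the stated session length.
for per_session in (15, 18):
    print(per_session, sessions_needed(4, 5, 50, per_session))
```

With 15 to 18 events per session the estimate brackets the "about 60 sessions" stated in the text.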

The availability of trained listeners willing to participate in these
studies was limited, and not all performance indices were adequately studied.
This method was abandoned in favor of the ramped signal-to-noise ratio technique
discussed in subsequent sections of this report primarily because a faster
method was required. This early study was very useful in identifying several
weaknesses in procedures and equipment.

4.3 A Modified Threshold Procedure

Most of the results presented in this report were collected using a
modification of threshold procedures. The scheme used is modeled after the
techniques used by Cremer (1959) and Zwicker (1957) to determine the loudness
of various bands of noise. Some considerations which strongly influenced
the experimental techniques are listed below.

a)  Because  it  was  desired  at  the  outset  to  make  the  listener-related 
portions  of  the  system  portable  so  as  to  be  useable  at  the  sonar 
schools,  the  signals  are  pre-recorded  on  audio  tape.  Also,  the 
response  recorder  is  of  necessity  simple. 

b)  Responses  needed  to  be  recorded  automatically  without  the  test 
conductor  needing  to  be  present  because  of  other  demands  on  the 
author  which  required  considerable  periods  of  absence  from  the 
University. 

c)  The  process  of  data  reduction  was  to  be  as  simple  as  practical 
so  that  no  one  individual  would  be  overly  committed  to  this 
portion  of  the  effort. 

In the modified threshold procedure, the probe is initially presented
in a broadband noise background at a low signal-to-noise ratio. This
signal-to-noise ratio (SNR) is then increased linearly with time until the
listener can determine which exposure signal the probe corresponds to. This
method has an advantage in its similarity to the sequential classification
task faced by a sonar operator in a real-world decreasing range situation.
However, the results only provide data about the probability of a correct
(or wrong) classification whenever the listener has sufficient confidence to
make a terminal decision.
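
The ramp described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the equipment's actual behavior: the starting range, the ramp rate, and all function names are hypothetical, since the report does not give numerical values at this point.

```python
import random


def probe_snr(t_s, start_snr_db, ramp_db_per_s):
    """SNR in dB presented t_s seconds into the response period (linear ramp)."""
    return start_snr_db + ramp_db_per_s * t_s


def make_event(rng, start_range_db=(-12.0, -8.0), ramp_db_per_s=0.25):
    """Draw a randomized starting SNR for one test event (values are assumed)."""
    return {
        "start_snr_db": rng.uniform(*start_range_db),
        "ramp_db_per_s": ramp_db_per_s,
    }


def snr_at_response(event, response_time_s):
    """SNR at the moment the listener pressed the A or B switch."""
    return probe_snr(response_time_s, event["start_snr_db"], event["ramp_db_per_s"])


rng = random.Random(0)
event = make_event(rng)
print(event, snr_at_response(event, 20.0))
```

Randomizing the starting value means that elapsed time alone does not determine the SNR at which a response is made.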

The  sequence  of  occurrences  during  each  test  event  is  shown  in  Figure  4.1. 
The  exposure  signals  A and  B are  chosen  with  randomization  from  the  set  of 
auditory  patterns  of  interest.  Generally,  these  patterns  differ  by  one  or  more 
features.  The  initial  and  refresh  exposures  are  of  fixed  duration  which  the 
listeners  generally  agreed  on  as  sufficient  for  later  recognition.  The  probe 
SNR  is  always  low  initially  but  this  value  is  randomized  to  avoid  listeners 
responding on the basis of time instead of the perceived aural response.

Figure 4.1 Overall Timing Diagram for an Event of the
Modified Threshold Procedure

Responses were recorded by listeners by pressing either a switch marked
A or one marked B. Once the listener indicated a decision, the auditory signal
was blanked for the remainder of the response period. The duration of the
response period was also randomized. By not knowing when the response period
would end, the listeners were in effect assigned a cost for continuing in the
listening  state.  Initially,  it  was  planned  to  compensate  the  listeners  on 
the  basis  of  performance.  The  payoff  would  have  been  computed  on  the  basis  of 
correct  classifications  with  a penalty  for  wrong  answers,  and  no  payoff  for 
events  where  no  classification  was  made.  The  use  of  trained,  expert  listeners 
obviated  the  need  for  this  gaming  approach  because  the  level  of  motivation 
was high and the level of performance could be established by verbal
instructions alone.

The pre-recorded signals were played back using a Crown 700 tape player
and a pair of Telephonics Model TDH-39 dynamic earphones. The signal was
fed in phase to a matched set of these headphones, and the listener was further
acoustically isolated by being enclosed in an audiometric booth. Figure 4.2
diagrams the equipment needed at the listener location. A Sony cassette
recorder is used for recording the responses, which are coded as frequencies.
Also recorded onto this cassette is a tone whose frequency tracks the SNR.
This tone and other signals needed to properly sequence the response equipment
are recorded on one channel of the two-channel Crown tapes. The exposure, probe
and interfering noise signals are recorded on the other channel of these tapes.
The equipment needed at the site of the tests is pictured in Figure 4.3.

To  analyze  the  data,  the  cassette  tapes  were  played  back  on  another  cassette 
tape  unit.  The  SNR  related  tone  and  the  response  tones  were  then  output  on  a 
cash  register  tape  by  means  of  a counter  and  printer  combination.  Manual  methods 
were  used  to  determine  the  SNR  at  the  moment  when  the  listener  responded.  Also, 
the  response  (A  or  B)  and  the  time  from  the  beginning  of  the  response  period 
were  recorded  on  a data  form.  As  the  number  of  data  points  accumulated,  the 
results were keypunched and copied onto a computer magnetic tape for retention
and accessibility. Computer software to perform various statistical analyses
on the data has also been developed.

4.4  Listener  Safeguards  and  Calibration  Techniques 

Prior to beginning experiments using human subjects, approval was obtained
in compliance with the Institutional Assurance provisions of The Pennsylvania
State University. For approval, a prospectus of the study was submitted.
This proposed approach was reviewed by a University Committee to ascertain
that no danger to the listeners would result. Specifically, the levels
and durations of signals are such that no damage risk is incurred (American
Academy of Ophthalmology, 1957). Each subject was instructed as to his role
as participant in the study, and it was pointed out that under no conditions
would the sounds be so loud as to cause discomfort.

Figure 4.2 Summary of Equipment at the Test Site

In order to eliminate the effect of a shift in subjective loudness as
the probe signal-to-interfering-noise ratio changed, a balanced mixer
followed by an automatic loudness control was used. The loudness as presented
to the subject was carefully maintained at a loudness level of 65 phons (GD)
(ISO R532). This loudness level was verified from 1/3-octave band
measurements of the voltage fed to the headphones, taking into account
the factory-provided earphone calibration with the MX-41/AR ear cushions.
This loudness level applied to the exposure signals as well as the probe
signal in noise.

The SNR was controlled by a balanced mixer with step increments of about
1/2 dB SNR per step. The shaft of this mixer is connected to a potentiometer
which is used to vary the control voltage of a voltage-controlled oscillator.
The oscillator frequency as a function of the SNR at the output of the mixer
was calibrated at intervals of about 1/2 dB, and it was this calibration data
which was used in the subsequent data reduction steps. The combined frequency
uncertainty due to flutter and wow in the Crown tape player, the Sony cassette
recorder, and the final cassette player was less than ±1/4 of the
frequency shift resulting from a change in SNR of 1/2 dB.
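
The data reduction step implied by this calibration, converting a measured tone frequency back to an SNR, might be sketched as a table lookup with linear interpolation. The report describes the conversion as manual, so the code below is only an assumed equivalent; the calibration values and the function name are hypothetical.

```python
from bisect import bisect_right


def snr_from_tone(freq_hz, calibration):
    """Interpolate SNR (dB) from a tone frequency.

    calibration: list of (freq_hz, snr_db) pairs, sorted by strictly
    increasing frequency, with at least two entries.
    """
    freqs = [f for f, _ in calibration]
    if not freqs[0] <= freq_hz <= freqs[-1]:
        raise ValueError("tone frequency outside calibrated range")
    i = bisect_right(freqs, freq_hz)
    if i == len(freqs):
        # Exactly on the last calibration point.
        return calibration[-1][1]
    (f0, s0), (f1, s1) = calibration[i - 1], calibration[i]
    return s0 + (s1 - s0) * (freq_hz - f0) / (f1 - f0)


# Hypothetical calibration points spaced 1/2 dB apart, as in the text.
example_calibration = [(1000.0, -6.0), (1010.0, -5.5), (1020.0, -5.0)]
print(snr_from_tone(1005.0, example_calibration))
```

Because the stated flutter and wow uncertainty is less than a quarter of one calibration step, interpolation error of this kind would be small relative to the 1/2 dB step size.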

Each of the two exposure signals was analyzed and plotted against the
interfering noise background using 1/3-octave analysis accurate to 1/2 dB.
This calibration was performed with the SNR mixer at a predetermined
setting (0).

CHAPTER V

EXPERIMENTS IN AUDITORY PATTERN DISCRIMINATION

5.1 Introduction

A total  of  seven  trained  university  students  were  used  as  listeners 
in  experimental  sessions  spanning  about  six  months.  These  students  were 
concurrently,  or  had  been,  participants  in  other  psychoacoustic  studies 
at  The  Pennsylvania  State  University.  There  were  three  advanced  standing 
female  students  and  four  graduate  school  males  with  normal  hearing  as 
verified by current audiograms. These listeners were free to administer test
sessions on their own at their convenience with the following restrictions:

a)  No  two  test  sessions  could  run  sequentially. 

b)  No  more  than  two  test  sessions  could  be  held  on  one  day. 

For all results presented in this chapter, the listeners used the
modified threshold procedure. Also, they were instructed during the
initial interviews and by voice comments on the tapes to:

"Indicate your choice as to which signal is being presented in
the noise only when reasonably certain of your decision."

The listeners were paid a fixed amount for each session they participated
in. No feedback as to the number of correct classifications was given unless
early results indicated either a tendency to guess at a very low SNR or a too
conservative criterion which resulted in the probability of correct decisions
being obviously higher than that obtained by other subjects. These early
results have been excluded from the data presented.

In the analysis of the results which follows, the term treatment is used
to indicate a particular exposure set/probe combination. Hence, a
situation where the signal pattern $\omega_i$ is associated with exposure
signal A, signal pattern $\omega_j$ with exposure signal B, and signal pattern
$\omega_i$ (or $\omega_j$) is used as the probe signal in an ocean ambient
noise, is a specific treatment. The results of several treatments may,
however, be combined (pooled) in some of the presentations. The statistical
analysis consists of testing the effect of a treatment applied to the
population consisting of the seven listeners. Except where specifically noted,
no distinction is made between the subjects comprising the population. This
method of presentation is tantamount to assuming that there are no individual
differences (Murdock, 1968).




May 28, 1975
CPJ:clb:cjg


Whenever data from magnetic tapes was used as input, a whitening filter was
used to make the background noise nearly a broadband white noise. This
whitening was used to reduce the overall signal dynamic range and is
consistent with operational practice in sonar equipment.

5.2 Effects of High Pass Filtering

In studies of the intelligibility of speech in noise, it is found that
a correspondence can be established between the cutoff frequency and
listener performance (Pollack, 1948). There seems to be a definite
relationship between performance in various audiometric tests and the
trained speech recognition process (Stevens and House, 1972). For this reason,
one of the principal areas of investigation centered on the effect of
high pass filtering on the classification performance against marine sounds.

In this study there were 16 treatments involving four magnetic recordings
of surface shipping. In all cases the exposure set consisted of such a
recording and an identical signal which was high pass filtered with a
sharp filter with its 3 dB down point at 707 Hz. The frequency was chosen to
correspond to an octave band edge for octave band 30, complying with USA
Standard S1.6-1967. Table 5.1 summarizes the treatments. Table 5.2 gives the
significant aural characteristics of the sources. It can be seen that the four
ship sounds represent a considerable range in characteristics.
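
The 707 Hz cutoff can be checked with a short calculation. The sketch below assumes the usual band-numbering convention in which band number N has an exact center frequency of 10^(N/10) Hz and an octave band extends a factor of the square root of 2 on either side of its center; both the convention and the function name are assumptions of this example rather than statements from the report.

```python
def octave_band_edges(band_number):
    """Return (lower edge, center, upper edge) in Hz for an octave band.

    Assumes center = 10 ** (band_number / 10) Hz and edges at
    center / sqrt(2) and center * sqrt(2).
    """
    center = 10.0 ** (band_number / 10.0)
    half_octave = 2.0 ** 0.5
    return center / half_octave, center, center * half_octave


lower, center, upper = octave_band_edges(30)
print(round(lower, 1), round(center, 1), round(upper, 1))
```

Under these assumptions band 30 is centered at 1000 Hz and its lower edge falls at about 707 Hz, which matches the filter cutoff quoted in the text.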

On the basis of considerations presented in a later section, the data
for treatments differing only in the order of the exposure signals were pooled.
The difference in 1/3-octave spectral levels between the interfering noise and
the probe signal is shown in Figures 5.1 and 5.2. The probe signal in each
case is shown at a relative level corresponding to the point estimate of the
SNR required for listener response.

It can be seen from these data that the difference in the SNR required to
respond to probe signals $\omega_{ij}$ and $\omega_{ijHP}$ is in all cases
less than 1 dB. This difference is not statistically significant when tested
by the t test for difference of means. We may, therefore, pool all treatments
which involve the same exposure signals in any order.

Let $P_{ij}(C)$ denote the probability of a correct classification for pooled
treatments, and $P_{ij}(F)$ the probability of an incorrect classification for
the pooled data. The number of no-response situations was small for this group
of data, and any event which did not end in a terminal decision was eliminated
from consideration, regardless of subject or session. This handling of the
data was deemed adequate even though it did reduce the total number of events
useful in the analysis. Had the number of no-responses been greater, the
situation would have been handled differently by the application of
statistical concepts valid for marginal distributions (Winer, 1962).

TREATMENT   A                  B                PROBE              NO. OF EVENTS
I.14        (not legible)      (not legible)    (not legible)      13
I.15        $\omega_{14HP}$    $\omega_{14}$    $\omega_{14}$      8
I.16        $\omega_{14HP}$    $\omega_{14}$    $\omega_{14HP}$    2

(Rows I.1 through I.13 are not legible in the source.)

* $\omega_{ijHP}$ is signal pattern $\omega_{ij}$ high pass filtered with a
filter with cutoff frequency of 707 Hz.

Table 5.1. Summary of Treatments for Studies
of the Effects of High Pass Filtering

SOUND PATTERN    DESCRIPTION OF ITS SOUND

$\omega_{10}$    High speed rhythmic sound with pronounced bursts of high
                 frequencies but of short duration. Sound is best described as
                 that of a number of horses galloping across a concrete bridge.

$\omega_{11}$    Very regular rhythm like that of a steam engine or slow
                 train as heard inside. Some irregular high frequency popping.

$\omega_{12}$    Noise-like without any noticeable rhythm. Sound like
                 that of high speed air but at the same time having a low
                 rumbling component.

$\omega_{14}$    Noise-like with some irregular high frequency sounds like
                 water splashing. Also, has a very pronounced steady tone in
                 octave band 34 (frequency not legible). A hint of other tones
                 which wax and wane.

Table 5.2. Summary of Sounds Associated with
Patterns $\omega_{10}$, $\omega_{11}$, $\omega_{12}$, and $\omega_{14}$

The results for the pooled data are shown in Table 5.3. These are the
point estimates for the SNR required for a terminal decision under the criterion
of "reasonably certain." With each set of exposure signals $\omega_{ij}$ and
$\omega_{ijHP}$ there is associated an observed probability of correct
decision. This too is a point estimate of the underlying probability governing
the likelihood of a specific outcome to the experiment each time it is
performed (Swets, 1964). Assuming that the pooled events comprise Bernoulli
trials with equal probability of occurrence p, the binomial distribution

$$f(x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \qquad x = 0, 1, 2, \ldots, n$$

$$f(x) = 0 \qquad \text{elsewhere}$$

applies for x observed correct classifications in n trials (Hogg and Craig,
1970). Again this approach assumes no individual differences between listeners,
no order effects, and no differences attributable to the probe used. Murdock
(1968) shows that this binomial distribution may be transformed to an
approximately Gaussian distribution using the transformation

$$\theta = 2 \arcsin \sqrt{x/n} .$$

The variance of the transformed variable will be:

$$\sigma_{\theta}^{2} = 1/n .$$

Hence, $\theta$ is $N(\theta_0, 1/n)$, where $\theta_0$ is the transformed
sample probability of correct classification.
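
The claim that the transformed variable has variance close to 1/n can be checked numerically. The sketch below sums over the exact binomial distribution; the function names and the choice of n = 100 and p = 0.7 are illustrative assumptions, not values taken from the report's data.

```python
import math


def arcsine_transform(x, n):
    """theta = 2 * arcsin(sqrt(x / n)) for x correct classifications in n trials."""
    return 2.0 * math.asin(math.sqrt(x / n))


def inverse_arcsine_transform(theta):
    """Map a transformed value back to a proportion."""
    return math.sin(theta / 2.0) ** 2


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n Bernoulli trials."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def transformed_moments(n, p):
    """Exact mean and variance of theta when x is binomial(n, p)."""
    mean = 0.0
    second = 0.0
    for x in range(n + 1):
        weight = binomial_pmf(x, n, p)
        theta = arcsine_transform(x, n)
        mean += weight * theta
        second += weight * theta * theta
    return mean, second - mean * mean


mean, var = transformed_moments(100, 0.7)
print(mean, var, 1.0 / 100)
```

The useful property is that the variance depends almost entirely on n and hardly at all on p, which is what allows a single interval width to be used across pattern pairs with different P(C).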


                                  MEAN       STANDARD     OBSERVED   90 PER CENT CONFIDENCE INTERVAL
PATTERNS                          SNR        DEVIATION    P(C)       ON SNR          ON P(C)
                                             $\sigma_{SNR}$

$\omega_{10}$, $\omega_{10HP}$    2.99 dB    2.81 dB      0.67       2.35, 3.63      0.62, 0.72
$\omega_{11}$, $\omega_{11HP}$    2.41 dB    3.48 dB      0.80       1.49, 3.33      0.74, 0.85
$\omega_{12}$, $\omega_{12HP}$    4.80 dB    3.20 dB      0.59       4.15, 5.45      0.55, 0.62
$\omega_{14}$, $\omega_{14HP}$    4.92 dB    3.81 dB      0.70       3.58, 6.26      0.62, 0.77

Table 5.3. Summary of Results Pooled Across
Order and Probe
The 90 per cent confidence interval on the mean SNR when the sample
standard deviation is used is:

$$\left[\, \overline{SNR} - t(0.95,\nu)\,\frac{\sigma_{SNR}}{\sqrt{n}} \;<\; \mu_{SNR} \;<\; \overline{SNR} + t(0.95,\nu)\,\frac{\sigma_{SNR}}{\sqrt{n}} \,\right]$$

where t(0.95,ν) is the single tail t statistic at the 95 per cent level with
ν degrees of freedom (Weinberg and Schumaker, 1962). In the limit of a
large number of degrees of freedom

$$t(0.95,\nu) = 1.69, \qquad \nu > 33.$$

The 90 per cent confidence interval on the probability of correct
classification is given approximately by:

$$\left[\, \theta_0 - \frac{Z(0.95)}{\sqrt{n}} \;<\; \theta \;<\; \theta_0 + \frac{Z(0.95)}{\sqrt{n}} \,\right]$$

where Z(0.95) is the Z score corresponding to the one-tailed normal
distribution (Murdock, 1968). Also,

$$Z(0.95) = 1.69.$$

These confidence intervals are also given in Table 5.3. Note that the
confidence interval on the probability of a correct decision is quite large
(approximately ±5 per cent) even when a large number of events are pooled.
Figure 5.3 shows the behavior of this confidence interval for the two cases
where the actual probabilities of the occurrence of a correct decision are 0.7
and 0.5 respectively. It is seen that the number of events must exceed
about seventy for the 90 per cent confidence interval to be within ±10
per cent of this value. The number of events needed to estimate this
parameter is therefore greater than 50 and preferably 70 if the confidence
interval is to be at all useful. Workers using the forced response
techniques routinely use 300 or more events to estimate each data point
(Swets, 1964; Green and Swets, 1966; Green, 1960a).
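
The interval on P(C) and the "about seventy events" figure can be reproduced with a short sketch. Two assumptions should be noted: the code uses the exact standard normal quantile (about 1.645) rather than the value 1.69 quoted in the text, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def pc_confidence_interval(p_hat, n, confidence=0.90):
    """Two-sided interval on P(C) via the arcsine transform, variance 1/n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    theta0 = 2.0 * math.asin(math.sqrt(p_hat))
    half = z / math.sqrt(n)
    # Clamp to the valid range of theta before transforming back.
    lo = max(0.0, theta0 - half)
    hi = min(math.pi, theta0 + half)
    return math.sin(lo / 2.0) ** 2, math.sin(hi / 2.0) ** 2


def events_for_half_width(p, half_width, confidence=0.90):
    """Smallest n whose interval half-width on P(C) is at most half_width."""
    n = 2
    while True:
        lo, hi = pc_confidence_interval(p, n, confidence)
        if (hi - lo) / 2.0 <= half_width:
            return n
        n += 1


print(pc_confidence_interval(0.7, 70))
print(events_for_half_width(0.5, 0.10), events_for_half_width(0.7, 0.10))
```

Using a larger quantile such as 1.69 widens every interval slightly and raises the required number of events by a few, so the conclusion in the text is not sensitive to this choice.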

Another product of this research is the observation of the types of
errors made by the listener. Table 5.4 summarizes these results. It is
seen that the types of misclassifications are not strongly biased, and these
data tend to support the conclusion that the listener is equally likely to
misclassify $\omega_{ij}$ as $\omega_{ijHP}$ as to misclassify
$\omega_{ijHP}$ as $\omega_{ij}$.

                                  NUMBER OF        OBSERVED PROBABILITIES
PATTERNS                          EVENTS           $\alpha$      $\beta$

$\omega_{10}$, $\omega_{10HP}$    (not legible)    0.19          0.14
$\omega_{11}$, $\omega_{11HP}$    (not legible)    0.07          0.13
$\omega_{12}$, $\omega_{12HP}$    (not legible)    0.18          0.22
$\omega_{14}$, $\omega_{14HP}$    (not legible)    0.17          0.13

Table 5.4. Misclassification Probabilities
for High Pass Filtering Experiments

5.3 Temporal and Tone Effects

Again referring to the findings from speech intelligibility, the intensity,
frequency, and time structure of a signal would be expected to be important
(Licklider and Miller, 1951). Two sets of experiments were performed to address
this question. The first of these compared the classification performance when
the exposure set consisted of two signals which were spectrally similar but
differed in the signal envelope structure. The second set of experiments
investigated the effect of "soft" limiting of the signal.

A total of five treatments were tested in which the patterns were a
recording of a marine sound (pattern class $\omega_{ij}$) and a spectrally
similar shaped Gaussian noise (pattern class $\omega_{ijSN}$). These
treatments are summarized in Table 5.5. The shaped noise was obtained by
passing the output of a broadband noise generator through a General Radio
multi-filter with the 1/3-octave weights adjusted to give a resulting spectrum
which by 1/3-octave analysis differed by no more than 1 dB in any band from
the recorded marine source spectral level.

The shaped noise differed from the recorded source both in its amplitude
vs. time behavior and in its narrowband spectral structure. Sound pattern
$\omega_{10}$ is characterized by a pronounced envelope modulation but has a
relatively smooth averaged narrowband spectrum. In contrast, sound pattern
$\omega_{14}$ has little systematic variation of level with time but has a
tonal quality due to strong narrowband components. When the data are pooled
across order and probe, the resulting SNR and P(C) are as shown in Table 5.6.

To test the effect of limiting on signals, the tape recorded signal was
passed through a non-linear circuit with the transfer characteristic shown in
Figure 5.5. The input level was so adjusted that limiting began at the
$1\sigma$ point of the time waveform. Such a transfer function, while
modifying the detailed amplitude vs. time structure of the signal, does not
greatly reduce the intelligibility of speech (Licklider, Bindra and Pollack,
1948).

Five treatments using two marine source recordings were investigated. Of
the 55 events, only 18 resulted in a terminal decision, and of these 55 per cent
were correct classifications. The treatments and number of events tested in
each case are summarized in Table 5.5. The responses and listener comments
indicate that there is little subjective difference in the sounds attributable
to the limiting process. Those responses which were made occurred at an SNR
near the maximum presented. At this high SNR the signals are little
contaminated by the background noise, and the task is one of matching an
essentially pure probe to the exposed signal set.

TREATMENT   A                      B                      PROBE                  NO. OF EVENTS

II.1        $\omega_{10}$          $\omega_{10SN}$        $\omega_{10SN}$        5
II.2        $\omega_{10SN}$        $\omega_{10}$          $\omega_{10}$          18
II.3        $\omega_{14}$          $\omega_{14SN}$        $\omega_{14}$          11
II.4        $\omega_{14}$          $\omega_{14SN}$        $\omega_{14SN}$        17
II.5        $\omega_{14SN}$        $\omega_{14}$          $\omega_{14}$          17

Analysis objective for treatments II.1 to II.5: To investigate the effect of
replacing a recorded source by a spectrally similar shaped noise
($\omega_{ijSN}$).

III.1       $\omega_{10}$          $\omega_{10,1\sigma}$  $\omega_{10}$          11*
III.2       $\omega_{10}$          $\omega_{10,1\sigma}$  $\omega_{10,1\sigma}$  11*
III.3       $\omega_{14}$          $\omega_{14,1\sigma}$  $\omega_{14}$          10*
III.4       $\omega_{14}$          $\omega_{14,1\sigma}$  $\omega_{14,1\sigma}$  11*
III.5       $\omega_{14,1\sigma}$  $\omega_{14}$          $\omega_{14,1\sigma}$  10*

Analysis objective for treatments III.1 to III.5: To ascertain if $1\sigma$
limiting of a recorded source ($\omega_{ij,1\sigma}$) has a discernible effect
on the signal.

* No. of events attempted.

Table 5.5. Summary of Treatments Investigating
Temporal and Tone Effects

                                  MEAN        STANDARD     OBSERVED   90 PER CENT CONFIDENCE INTERVAL
PATTERNS                          SNR         DEVIATION    P(C)       ON SNR           ON P(C)
                                              $\sigma_{SNR}$

$\omega_{10}$, $\omega_{10SN}$    -3.23 dB    2.15 dB      0.78       -3.99, -2.47     0.61, 0.87
$\omega_{14}$, $\omega_{14SN}$    -0.45 dB    3.26 dB      0.84       -1.27, 0.37      0.72, 0.92

Table 5.6. Summary of Results for Tests with
Spectrally Shaped Noise

5.4 Learning Effects and Experimental Biases

In any experimental technique such as the modified threshold procedure
used here, the question of biases introduced by that technique must be
investigated. Referring to Figure 4.1, it is seen that in every case signal B
is presented just prior to the response period. It is worthwhile, therefore,
to ask the question:

Does the order of the signal exposures significantly affect the
experimental outcome?

That is, can we reject the null hypothesis that there is no order effect
due to the sequence of the exposure signals? From Tables 5.1 and 5.5 we see
that the pairs of treatments listed below differ only in the order of the
exposure signals.

I.1 and I.3, I.2 and I.4, I.5 and I.7,
I.6 and I.8, I.9 and I.11, I.10 and I.12,
I.13 and I.15, I.14 and I.16, II.3 and II.5.

When tested by applying the t test for difference of means (SNR), only
in the case of treatment pair I.1 and I.3 is the null hypothesis rejected at
the 0.1 level. Some doubts about the calibrations in this case may account for
this discrepancy. On the basis of these results, it appears that there is not
a significant order effect. These findings allow pooling of the data without
regard to exposure order.
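
The comparison applied to each treatment pair might be sketched as follows. The report does not state which form of the t test was used, so the pooled-variance (equal variance) form below is an assumption, as are the function names; the critical value must be supplied from a t table for the relevant degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, dof


def order_effect_detected(a, b, t_critical):
    """True when |t| exceeds the tabulated two-sided critical value."""
    t, _ = pooled_t_statistic(a, b)
    return abs(t) > t_critical


# Hypothetical SNR-to-respond values (dB) for two treatments that differ
# only in exposure order; these are not data from the report.
order_ab = [1.0, 2.0, 3.0, 4.0, 5.0]
order_ba = [2.0, 3.0, 4.0, 5.0, 6.0]
print(pooled_t_statistic(order_ab, order_ba))
```

For a two-sided test at the 0.1 level with 8 degrees of freedom the tabulated critical value is 1.860, so the hypothetical pair above would not show a significant order effect.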

Another consideration in this type of experiment is that of the time factor.
In studies of auditory memory span, it is routine to present an exposure signal
followed by a shadowing stimulus prior to the probe presentation (Parkinson,
Parks and Kroll, 1971). The effect of the shadowing stimulus is quite pronounced
when tests are done with isolated letters or phonemes in a non-rehearsal
situation. If we now consider the low SNR presentation at the beginning of the
response period as a shadowing stimulus, we may expect the performance to
decrease as the time to respond increases. Also, there is the question of
memory latency which may enter in.

In order to test for this effect, the rank correlation coefficient of SNR
to respond vs. the time to respond was computed. This value varied between
0.03 and 0.33. With these data alone, however, it is difficult to determine if
a rank correlation coefficient on the order of 0.3 does in fact indicate a
significant correlation. If, for example, the initial SNR were always
fixed, the SNR at response would be perfectly correlated with response
time by virtue of the fact that the SNR varies in a predetermined fashion
with time. We have purposefully randomized this initial SNR to reduce this
effect. It is not immediately clear how much remaining correlation one should
expect. In a later chapter, a revised experimental approach is suggested
which may allow an answer to the question:

How much effect is there due to contamination of the reference sounds
in memory due to time lapse or shadowing?
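
The rank correlation discussed above, and the point about a fixed starting SNR, can be illustrated with a small sketch. The report does not say which rank coefficient was computed; Spearman's coefficient with average ranks for ties is assumed here, and the function names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# With a fixed starting SNR, SNR at response is a linear function of time,
# so the rank correlation is forced to unity regardless of the listener.
times_s = [3.0, 7.0, 5.0, 9.0]
snr_fixed_start = [-10.0 + 0.5 * t for t in times_s]
print(spearman_rho(times_s, snr_fixed_start))
```

Randomizing the starting SNR breaks this forced relationship, which is why the observed coefficients of 0.03 to 0.33 are hard to interpret without knowing how much correlation the ramp itself still induces.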

The responses for two pairs of signal patterns are plotted in Figures 5.5
and 5.6. It is seen that there may in fact be some tendency for the required
SNR to increase with response time. The degree of this trend, if it does in
fact exist, cannot be ascertained. It should also be pointed out that the
present set of data does not conform to the test without rehearsal done by
other workers. The listeners were in fact encouraged to make notes which could
help them in the classification task.

CHAPTER VI

DISCUSSION OF THE EXPERIMENTAL RESULTS

6.1 Interpretation of the Results from the Point of View of
Detection Theory

In this chapter we will analyze the results of the experiments to
date from the point of view of signal detection theory. Specifically, the
questions raised in Chapter II about the dichotomous feature classification
problem will be pursued.

Consider an auditory recognition task where two signal patterns $\omega_i$
and $\omega_j$ differ only by the presence of a feature $\Omega$ in one case.
This feature will have associated with it a detectability d'. We will then
model the classification task as a hypothesis test designed to determine the
presence of feature $\Omega$. It is certainly not obvious that the ability
simply to detect the presence of a feature is adequate to make the pattern
classification. There are, after all, other features present in both signal
patterns which can perhaps confound the classification task. It will be seen,
however, that this simple approach does yield some rather consistent results.

6.2 Analysis of Results for High Pass Filtering

When the two pattern classes differ in that one ($\omega_{HP}$) has the low
frequency noise components eliminated by filtering, we can define a dichotomous
feature $\Omega_{LF}$ which embodies all of the low frequency information. This
is itself a feature complex in that it consists of at least the following
components:

a) A band-limited noise with non-uniform spectral density in the
auditory band from about 120 Hz to 707 Hz. (Limited by the tape
player at the low end.)

b) Amplitude modulation components of the above band-limited noise.

c) Perhaps some time dependent shifts in frequency of the spectral
components of this band-limited noise.

d) Tonals falling in this band of frequencies.

Further assume that this dichotomous feature can be characterized
as a first approximation as a band of noise with rectangular bandwidth
$W_{eff}$ and corresponding spectral level $\sigma_s^2$. The classification
under these assumptions then becomes:

Conclude $H_0$: decide signal pattern is $\omega_{HP}$.

Conclude $H_1$: decide signal pattern is the unfiltered pattern.

where the hypotheses are:

$H_0$: Feature $\Omega_{LF}$ is absent.

$H_1$: Feature $\Omega_{LF}$ is present.

The problem of detecting the band of noise was discussed in Chapter II.
We saw there that the optimum detector performance is given by Equation 2.2.

Green (1960a) extensively investigated listener performance in the
detection of bands of noise in noise. He found that
$d'_{obs} = \alpha \, d'_{opt}$, where $d'_{obs}$ is the observed
detectability, for a large range of WT. For a P(C) = 0.75, the value of
$\alpha$ was between 0.25 and 0.33 depending on subject. This finding was
seen to hold for a large range of center frequencies ($f_c$) of the band of
noise. The range of $f_c$ was from 600 to 6000 Hz, W varied from 655 to
5163 Hz, and T varied from 3 to 1000 msec. The results above 300 msec seemed
to deviate from the expected behavior, however. Also, he measured an $f_c$ of
7500 Hz but found some anomalies which he attributes to the effect of earphone
behavior.

Figure 6.1 combines all of Green's findings on probability paper. This
figure includes the result of two experiments with five different subjects and
with the following set of parameters.

$f_c$ = 600, 800, 1500, 2500, 6500, 6000 Hz.

W = 655, 3862, 5163 Hz.

T = 3, 10, 30, 100, 300 msec.

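These results are for a 2AFC procedure and the data are normalized by using
the equation:

$$10 \log d'_{opt} = 10 \log \left\{ (WT)^{1/2} \, \frac{\sigma_s^2}{\sigma_n^2} \left[ \frac{1}{2} \left( \frac{\sigma_s^2}{\sigma_n^2} \right)^2 + \frac{\sigma_s^2}{\sigma_n^2} + 1 \right]^{-1/2} \right\} \qquad (6.1)$$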
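The normalization can be tried numerically. The sketch below assumes the
usual optimum detector form for a band of noise in noise,
d' = (WT)^{1/2} x (x^2/2 + x + 1)^{-1/2}, with x the linear ratio of signal to
noise spectral level. The function names are hypothetical.

```python
import math

def d_prime_opt(bandwidth_hz, duration_s, snr):
    """Optimum-detector d' for a band of noise in noise.

    snr is the linear (not dB) ratio of signal to noise spectral level.
    """
    wt = bandwidth_hz * duration_s
    return math.sqrt(wt) * snr / math.sqrt(0.5 * snr ** 2 + snr + 1.0)

def to_db(value):
    """Express a positive quantity as 10 log10(value)."""
    return 10.0 * math.log10(value)
```

Under this form, quadrupling either W or T doubles d' (a gain of about 3 dB
in 10 log d'), so the expression grows without limit as the duration T is
increased.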

In our results, the detectability of feature $\Omega_{LF}$ can then be
thought to correspond to the band of noise used in Green's experiments. When
the equivalent square bandwidth $W_{eff}$ is computed for the four signals
used, it is seen that most of the signal power associated with $\Omega_{LF}$
falls in an octave band centered nominally about 500 Hz. Hence

$W_{eff}$ = 354 Hz.
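The width of an octave band follows directly from its center frequency. The
short sketch below assumes the band edges lie a factor of the square root of
two below and above a geometric center frequency; the function name is
hypothetical.

```python
import math

def octave_band_edges(center_hz):
    """Lower and upper edges of a one-octave band with geometric center center_hz."""
    return center_hz / math.sqrt(2.0), center_hz * math.sqrt(2.0)

low, high = octave_band_edges(500.0)
width = high - low
print(round(low, 1), round(high, 1), round(width, 1))  # prints 353.6 707.1 353.6
```

The upper edge agrees with the 707 Hz limit quoted for the low frequency
feature, and the width agrees with the value of about 354 Hz used later when
the detectability for the modified threshold procedure is defined.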
The ratio between the signal and noise spectral level,
$\sigma_s^2 / \sigma_n^2$, is shown in Table 6.1 for the four pairs of signal
patterns used.

If we are to normalize our data as does Green, how do we define T, the
system integration time, in Equation 6.1? The length of time for which
the SNR was maintained constant in the experiment was 2 seconds. However,
the listener could be integrating over successive steps. Or, the time T
may be much shorter due to some as yet unmentioned mechanism. We see that
the expression for $d'_{opt}$ implies continuously increasing detectability
with time. It is found, however, that a listener extracts all useful
information from an acoustic stimulus in the first few hundred milliseconds
(Tanner and Sorkin, 1972.) In fact, the behavior observed by Green for the
data with duration of 1 second indicates that

$d'_{obs}$ for T = 1000 msec $\approx$ $d'_{obs}$ for T = 300 msec.

A detailed analysis of the response behavior for the modified threshold
procedure indicates that listeners exhibit comparable behavior in this type
of experiment. The number of observed responses as a function of delay from
the step change in SNR is plotted in Figure 6.2. These data are for 160
responses chosen at random from among the approximately 480 available events.
In Figure 6.2 the independent variable t is the delay to respond from the
onset of the step change in SNR. The peak in the number of responses at
600 msec is significant at the 0.1 level under the assumption of a Poisson
distribution for the number of responses per 100 msec of delay. This
observation supports the view that each step change in SNR can be treated as a
detection opportunity. After a delay, the listener has extracted all of the
additional information provided by the change in detectability associated
with the change in SNR and either makes a terminal decision or defers the
decision until another such change occurs. It is noteworthy that this result
is observed in spite of the fact that listeners were unable reliably to
describe the way in which the SNR changed when asked to verbalize the auditory
sensation. They were unable consciously to tell if the SNR changed continuously
or in steps!
Figure 6.2 Distribution of Responses as a Function of Delay from
the Step Change in SNR
From these data, and allowing for a response time of 100 msec from
the time of the terminal decision to the recording of that response, it
was inferred that an integration time of 500 msec is reasonable.
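The significance statement for the peak in the response histogram can be
illustrated with a Poisson tail probability. The sketch below assumes that the
count in a single 100 msec bin is Poisson distributed about a known mean; the
mean and the peak count in the usage note are hypothetical and are not read
from Figure 6.2.

```python
import math

def poisson_upper_tail(count, mean):
    """P(X >= count) for X distributed as Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i)
                for i in range(count))
    return 1.0 - below
```

For example, with a hypothetical mean of 8 responses per bin, a bin holding
13 responses has an upper tail probability of about 0.06, which would be
significant at the 0.1 level. This treats the peak bin in isolation and makes
no allowance for having selected the largest of several bins.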


We can then define a detectability

$$d'_{mt} = (W_{eff} T_{eff})^{1/2} \, \frac{\sigma_s^2}{\sigma_n^2} \left[ \frac{1}{2} \left( \frac{\sigma_s^2}{\sigma_n^2} \right)^2 + \frac{\sigma_s^2}{\sigma_n^2} + 1 \right]^{-1/2}$$

where

$W_{eff}$ = 354 Hz, $T_{eff}$ = 0.5 sec

and $d'_{mt}$ is the observed detectability using the modified threshold
procedure. The point estimate and 90 per cent confidence interval on
10 log $d'_{mt}$ are also given in Table 6.1.


When $d'_{mt}$ is compared to $d'_{obs}$ from Green's results, we find that

10 log $d'_{mt}$ - 10 log $d'_{obs}$ = 5.3 ± 1.2 dB

when the point estimate of P(C) is used for comparison purposes. It is
seen that the four signals give consistent results one to another in spite
of the fact that they differ significantly in their overall sound. Also,
in spite of the fact that the listener instruction in all cases was to
"....indicate your choice only when reasonably certain.", the probabilities
of a correct response differed significantly, especially in the case of
one signal pattern. The reduced P(C) was also made at a significantly lower
SNR, however. The behavior is consistent with the results obtained by Green
and conforms to the model. The reason for this difference in what is
apparently the listener criterion is treated in more detail in a later
section of this chapter.

What of the five dB difference between $d'_{mt}$ and Green's results? Up to
this point we have treated the classification as if it were simply a detection
problem on feature $\Omega_{LF}$. However, the listener, in order to establish
a threshold for the terminal decision, must also listen to the rest of the
signal. Stallard and Leslie (1974) conclude on theoretical grounds that the
difference between a 2AFC experiment and passive sonar detection performance
should be about 5.4 dB. Their reasoning is as follows:

a) The effect of frequency uncertainty, because two bands of noise
must be attended to, introduces some decrease in detectability.

b) Time uncertainty about the onset time of the signal also
has the same effect.

c) They modeled the passive sonar problem as a YN task and
added another 1.5 dB for the difference between the
efficiency of a YN and 2AFC test.

This latter correction was made in our case by the definition of $d'_{opt}$
and hence is not applicable. It is, therefore, suggested by the analysis
presented by Stallard and Leslie that the present results should be related to
Green's (1960a) results by a factor of about 4 dB, or:

$d'_{mt} = 2.5 \cdot d'_{obs}$, theoretically.

The remaining discrepancy of about 1 dB is of doubtful significance.
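The decibel bookkeeping in this comparison can be checked in a few lines. The
sketch below uses only the figures quoted in the text (5.4 dB, 1.5 dB and the
observed 5.3 dB) and the 10 log convention used for d' in this chapter; the
function and variable names are hypothetical.

```python
import math

def db_to_ratio(db):
    """Convert a level difference in dB (10 log convention) to a ratio."""
    return 10.0 ** (db / 10.0)

def ratio_to_db(ratio):
    """Convert a positive ratio to dB (10 log convention)."""
    return 10.0 * math.log10(ratio)

total_db = 5.4        # Stallard and Leslie estimate, 2AFC vs. passive sonar
yn_vs_2afc_db = 1.5   # YN vs. 2AFC correction, stated as not applicable here
expected_db = total_db - yn_vs_2afc_db   # remaining correction, about 4 dB
observed_db = 5.3     # observed 10 log d'_mt - 10 log d'_obs

print(round(expected_db, 1), round(db_to_ratio(4.0), 2),
      round(observed_db - 4.0, 1))  # prints 3.9 2.51 1.3
```

A 4 dB difference corresponds to a factor of about 2.5 in d', and the observed
5.3 dB leaves a residual of roughly 1 dB, which lies within the quoted
uncertainty of ± 1.2 dB.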

More important is the matter of which SNR to use in computing the $d'_{mt}$ at
the time of the terminal decision. If the listener somehow integrates over
previous observations, the apparent SNR at the t-th observation would be
higher than measured. That is, if at observation t-1 the listener is aware
of the fact that the log likelihood ratio almost exceeded the threshold,
this is useful information and hence decreases the uncertainty (Raisbeck,
1963.) In fact, Swets and Green (Swets, 1964) have demonstrated that a
listener is indeed capable of integrating the information in successive
observations in very specialized circumstances. In general, however, they
note that:

"This analysis leaves little doubt that the assumption of no integration
over successive observations is a good one...."

That is not to say that the thresholds $\gamma_A$ and $\gamma_B$ are not
influenced by the sequential nature of this task. These thresholds affect the
respective probabilities of correct and wrong classifications but not the form
of d'.
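The distinction between integrating and not integrating over successive
observations can be made concrete with a small sketch. It assumes a listener
who receives one log likelihood ratio per step in SNR and compares it with an
upper and a lower threshold. The function name, the labels "A" and "B", and
the assignment of $\gamma_A$ to the upper threshold are illustrative
assumptions, not the model fitted in this report.

```python
def sequential_decision(llr_per_step, gamma_a, gamma_b, integrate=False):
    """Return (decision, step_index) for a two-threshold sequential task.

    llr_per_step: log likelihood ratios, one per observation.
    gamma_a: upper threshold; at or above it the decision is "A".
    gamma_b: lower threshold; at or below it the decision is "B".
    integrate: if True, sum the ratios over steps; if False, judge each
               observation on its own (the no-integration assumption).
    """
    running = 0.0
    for step, llr in enumerate(llr_per_step):
        running = running + llr if integrate else llr
        if running >= gamma_a:
            return "A", step
        if running <= gamma_b:
            return "B", step
    return None, len(llr_per_step)  # decision deferred on every step
```

With the hypothetical sequence [0.5, 0.8, 0.9, 2.1] and thresholds of 2.0 and
-2.0, the non-integrating listener responds "A" on the fourth observation,
while an integrating listener would respond one observation earlier, that is,
at a lower measured SNR.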

In Figure 6.3, the 90 per cent confidence intervals for these experimental
results are plotted on probability paper along with a fourth order polynomial
regression fit to Green's data (Green, 1960a, McGill, 1968). The predicted
performance for a sonar detection problem is also shown (Stallard and Leslie,
1974.)
6.3 Analysis of Results Obtained in Experiments Using Shaped Noise

In the experiments with shaped noise vs. the recorded marine sound, the
dichotomous feature was either the presence of amplitude modulation
($\Omega_{AM}$) in the case of one signal pair or the presence of a strong
tonal for the other signal pair. Neither one of these features is as simple as
implied, however. The spectrum of these sources exhibits frequency as well as
amplitude variations with time. Also, the signal containing the tonal has a
family of tones with only one pronounced steady tone and some varying
components.

The feature $\Omega_{AM}$ appears as a repeated burst of noise impressed upon
a continuous spectrum. Miller and Taylor (Miller, 1948, Miller and Taylor,
1948) have investigated the subjective character of this type of signal. It
was found that the differential threshold for intensity $\Delta I / I$
increases as the duration of the added burst of noise decreases. In the limit
where the duration of the added increment exceeds about 250 msec., the
performance is given by the Weber fraction

$\Delta I / I = 0.1$, t > 250 msec. (Green, 1960a).

For the signal in question, the natural modulation corresponding to feature
$\Omega_{AM}$ is in the form of short bursts of noise which are repeated more
or less periodically at a rate of 8 to 15 Hz. The duration of the noise pulse
is approximately 25 msec, but with a non-rectangular waveform. From Figure 5.1
we see that the effective bandwidth of the signal vs. noise background is
approximately 6 kHz, from 400 Hz to 6400 Hz at the 3 dB down points. The Weber
fraction $\Delta I / I$ is then the just perceptible incremental change in
intensity relative to the average intensity. Moore and Raab (1975) find that
the Weber fraction is:

-3.4 dB for 3160 Hz bandwidth and duration 10 msec.

-0.9 dB for 1000 Hz bandwidth and duration 10 msec.

-5.5 dB for 3160 Hz bandwidth and duration 250 msec.

-4.9 dB for 1000 Hz bandwidth and duration 250 msec.

-7.3 dB for an 18 kHz bandwidth and duration 250 msec.

To compare the present results with these findings, we must know the
amplitude excursion associated with the noise burst. The spectral level near
the center 12.5 msec time period of the amplitude modulation pulse was measured
to be 4 dB higher than the average spectral level. From the results shown in
Table 5.4, the following can be calculated:

a. The average signal plus noise level was +1.65 dB relative to the
background level alone. The action of the automatic loudness level control
circuit would be nearly to eliminate this effect.

b. The point estimate of instantaneous signal plus noise level was then
+3.5 dB relative to the background level or 1.85 dB relative to the average
signal plus noise level. This results in an observed Weber fraction of
-2.7 dB. The 90 per cent confidence interval on the observed Weber fraction
in our case is:

-3.5 dB < $W_{obs}$ < -2.6 dB.
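The step from a level increment in dB to a Weber fraction in dB can be
reproduced as follows. The sketch assumes the Weber fraction is expressed as
10 log10 of $\Delta I / I$, with I the average signal plus noise intensity;
the function and variable names are hypothetical.

```python
import math

def weber_fraction_db(increment_db):
    """10 log10(delta_I / I) for a level increment of increment_db (in dB)."""
    delta_over_i = 10.0 ** (increment_db / 10.0) - 1.0
    return 10.0 * math.log10(delta_over_i)

peak_re_background_db = 3.5       # instantaneous signal plus noise level
average_re_background_db = 1.65   # average signal plus noise level
increment_db = peak_re_background_db - average_re_background_db

print(round(increment_db, 2),
      round(weber_fraction_db(increment_db), 1))  # prints 1.85 -2.7
```

An increment of 1.85 dB corresponds to an intensity ratio of about 1.53, that
is, a $\Delta I / I$ of about 0.53, or -2.7 dB.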

When these results are compared to those of Moore and Raab (1975) it is
seen that the present observed SNR to respond agrees reasonably well. The
pulses comprising the modulation fall between those investigated by these
authors and therefore cannot be directly compared with them. However,
extrapolating their results using the empirical method proposed by them, the
predicted Weber fraction would be about -4.0 dB. The difference between this
predicted value and the results obtained using the modified threshold procedure
is certainly not unreasonable considering the simplifying assumptions. Hence,
as for the classification of the high pass filtered signal, the results
observed can be explained reasonably well by simply assuming that the problem
is one of detecting a dichotomous feature. It must be noted, however, that
treating these pulse-like increases in the amplitude as isolated pulses to be
detected in a noise background must be done with caution. These pulses occur
often enough that they approach an indistinguishable series of pulses as
investigated by Miller and Taylor (1948). The critical frequency is about
20 Hz and is also a function of duty factor.

In the case of the tonal feature, the tone at about 2.5 kHz has a spectral
level which is some 21 dB per Hz greater than the 1 Hz broadband spectral
level of the signal. The point estimate of the SNR required for classification
was -0.45 dB. The calibration in this case was made in the 1/3rd octave band
which contained the tone. However, to obtain the average broadband background
against which the tone is heard, we take the average spectral level in the
adjacent 1/3rd octave bands as representing the signal background level.

When treated in this way, we obtain:

a. The signal plus noise to noise background at the time of classification
responses was +1.5 dB relative to the background level. This is the
point of normalization.

b. The spectral level of the principal tone was then 19.5 dB in a 1 Hz
band relative to the background or about 18 dB relative to the signal plus
noise to noise reference point.

Hawkins and Stevens (1950) investigated the required spectral level of a
tone for it to be just audible in a broad noise background. They found that
an average relative spectral level of about 20 dB is required for a tone at
2.5 kHz to be audible. The results found in the present experiment are
consistent with these findings. The difference noted is well within
experimental errors in calibration and spectral measurement.

While the published data against which the present results are compared
do not specifically define a detectability d' for the respective features,
similar results have been obtained by workers using signal detectability
theories (Green, 1968, Green, 1970, Swets, Green and Tanner, 1962, Tanner, 1961).
The results obtained by Moore and Raab (1975) are somewhat at variance with a
model of the auditory process as an energy detector, and a difference of some 3
to 5 dB is observed by them between their studies and the energy detection
model. In any case, the present results agree reasonably well with the findings
of a number of workers who have concentrated on the signal detection task.
CHAPTER VII

SUMMARY AND CONCLUSIONS

7.1 Summary

In this report the complex task of aural classification of a source of
sound is reduced to a simpler problem of deciding between two sound patterns.
It was further assumed that in some cases this process could be treated
as a detection problem where the listener must first have the opportunity to
detect a signal feature before he can make a classification. When the
classification is considered in this way, it was found that for six cases,
involving three entirely different dichotomous features, the observed
classification performance is within a few dB of results predicted by other
researchers.

The studies against which the present experimental results are compared were
done with much simpler signals and all treated only the detection problem.

The good agreement between the present results and those obtained by
others argues in favor of an interpretation of these experiments as the
detection problem of a dichotomous feature. Certainly this is an
oversimplification, because the criterion with which the listeners responded
seems to depend on other characteristics of the signal. It is a
simplification which, however, is useful in the interpretation of a very
complex problem in cognition.


If indeed the results can be interpreted on the basis of this simplifying
model, the implications are far ranging. It is possible then, for example,
to predict directly from the results of fundamental studies what the effect
of the cutoff frequency of a filter will be on classification performance. A
wealth of such results is available in the literature, some of which is listed
in the bibliography of this report. Likewise, the effect of certain types of
modulation, and of tones, can be directly inferred.


Another significant finding to date is the fact that listeners are
equally proficient in making a decision about the absence of a feature as
they are about its presence. This result was obtained for the case where the
two signals had equal a priori probability of occurrence, and the penalty for
each type of error was nominally the same.
7.2 Limitations on the Present Results

When attempting to relate the present findings to the anticipated
performance of a sonar operator, some distinguishing features must be
considered. The listeners used here were trained college students with
considerable experience in critical listening. They have not been trained
in the sonar classification task, however, and the signal patterns to which
they were exposed were essentially context free. Also, they did not need
to classify on the basis of learned and retained information; rather it
was a matching problem with, at most, intermediate duration memory.

A sonar operator generally would need to classify targets into one
of a substantial set of possible targets, although at times a two-point
classification may be possible. The sonar operator also is able to listen
alternately to the target of interest and the ambient noise by means of a
trainable beam (in most systems.)

The extent to which the training of a sonar operator allows him to
outperform the college students will be investigated in experiments to be
conducted at the Sonar School in San Diego in August 1975. These tests and
their principal objectives are described in an appendix hereto.

The experimental technique used, the modified threshold procedure, allows
only a determination of listener performance at the time that a terminal
decision is made. While this is itself a useful bit of information, in many
cases it would be desirable to know how the performance (i.e. P(C)) varies
with signal-to-noise ratio.

7.3 Suggestions for Future Studies

To address the question of how the classification performance varies with
signal-to-noise ratio, two additional sets of experiments are proposed.

The first of these will be conducted at the Sonar School in conjunction
with the modified threshold procedure. The equipment uses the same signal
presentation sequence as is shown in Figure A.1. The listener will, however,
continuously indicate his degree of confidence and his tentative classification
decision by means of a linear potentiometer graduated as below.

A                                          B

10   8   6   4   2   0   2   4   6   8   10




May  28,  1975 
C;PJ:c]b:cjg 


The listener will indicate, say, a 2 toward the B if he has low confidence
that the probe signal corresponds to exposure signal B and will move up the
scale as his confidence increases. Some preliminary tests of this scheme
ascertain that the listeners are able to change from an initial classification
of, say, A to the alternate classification. Interpretation of these results
is not trivial, however, since the confidence scale tends to measure the
process:

State a confidence of a classification given
that an immediately preceding confidence was
thought to apply.

The second test to obtain more knowledge about the psychometric function
will be conducted at The Pennsylvania State University thanks to the efforts
of Dr. James Martin. These will be two alternative forced response tests
(2AFC) using a large number of naive college students. While the performance
for such listeners is expected to be less predictable, the experiment will
also be simpler. These results will principally test the degree to which the
modified threshold procedure conforms to standard signal detectability measures.

An additional test will be made in the Sonar School studies of the effect
of shadowing alluded to in a previous chapter. An adequate number of events
will be interspersed with other modified threshold tests to address this
point. For these events, an additional 30 second delay will be introduced by
a timer between the onset of the response period and the beginning of the
step increase in SNR. It is then possible to test the effect of a prolonged
shadowing period independently using the t test for difference of means.
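The t test for difference of means can be sketched as follows. The sketch
assumes two independent samples of the SNR required for a terminal decision
(for example, with and without the added delay) and uses the pooled variance
form of Student's t statistic; the function name is hypothetical, and no data
from the planned tests are implied.

```python
import math

def two_sample_t(sample_a, sample_b):
    """Student's t statistic and degrees of freedom, pooled variance form."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    dof = na + nb - 2
    pooled_variance = (ss_a + ss_b) / dof
    t = (mean_a - mean_b) / math.sqrt(pooled_variance * (1.0 / na + 1.0 / nb))
    return t, dof
```

The returned statistic is compared with the tabulated critical value of t for
the returned degrees of freedom at the chosen significance level.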


APPENDIX A

SUMMARY OF MODIFIED THRESHOLD TESTS

AT THE SONAR SCHOOL

Plans are to conduct extensive tests of the concepts proposed in this
report. These tests will be held in August 1975 at the ASW School, San Diego,
California. Tentatively, it is planned to use 8 senior sonar operators for
five days to obtain in excess of 800 events. Of these, 720 events will make
use of the modified threshold procedure to determine the SNR required to make
a terminal classification decision.

The tests are designed to address the following:

a) How do results obtained to date using University students compare
with those using trained sonar operators?

b) Verify aspects of the classification model, as proposed, to identify
the range of applicability in the sonar context.

c) Attempt to measure the significance of various features in the
classification task.

d) Provide more data points for signals of special significance to
the sponsor.

e) Attempt to determine the effect of memory and shadowing as related
to this type of task.

The treatments to be applied to the population of trained sonar operators
are summarized in Table A.3. Treatments 1-9 will be tested with a sufficient
number of events to allow estimation of the respective probabilities P(C)
and P(E), the probability of a correct terminal decision and of an incorrect
terminal decision. Treatments 10-17 will be used specifically to test for
differences in the mean signal-to-noise ratio (SNR) required for a terminal
decision. The t test for difference of means applies in these cases.
Feature
   n     Characterization of the Feature

   1     Presence of an octave band stationary noise with $f_c$ = 500 Hz.
   2     Presence of an octave band stationary noise with $f_c$ = 4k Hz.
   3     Square wave 10 Hz amplitude modulation of an octave band with $f_c$ = 500 Hz.
   4     Square wave 10 Hz amplitude modulation of an octave band with $f_c$ = 4k Hz.
   5     Presence of an octave band stationary noise with $f_c$ = 250 Hz.
   6     Blade rate amplitude modulation of an octave band with $f_c$ = 4k Hz.
   7     Marine source spectrum without amplitude modulation
   8     Blade rate amplitude modulation of a broadband marine source
   9     Marine source high pass spectrum with $f_c$ = 707 Hz and amplitude modulation
  10     Marine source bandpass spectrum with $f_c$ = 500 Hz and amplitude modulation
  11     Marine source bandpass spectrum with $f_c$ = 250 Hz and amplitude modulation
  12     Marine source low pass spectrum with $f_c$ = 177 Hz and amplitude modulation

Table A.1. Summary of Dichotomous Features


[Table A.2: rows are signal patterns 1 to 15; columns are the features 1 to 12
comprising each pattern. The positions of the individual entries within the
matrix are not recoverable from the source.]

*A feature is present (1) or has 1/4th relative detectability (.25).
Table A.2. Summary of Signal Patterns
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SlhNA), 

(i':H 

0 

I'A'I  ] 1 1;:;'. 

KK.NM,  ’1(1  IIOIM  ! 
KAI  In  RAM  0;  (Oil)  j 

1 

To  lost  till'  rfti'it  of  1 h<*  pnv.fncf  of 
H fi.inJ  of  iK'i:.*' 

2 

1 

-s.o, 

8.0 

2 

To  tost  tit'-  (ffrct  (»f  .It  .j- 1 \ t m!o  nodu- 
lol  Ion  (ni  .1  li.ind  of  iiv>iso 

1 

3 

-8.0, 

S.O 

3 

To  Mif  (fieri  of  tli-  pi  i -.etire  uf 

n b.iiul  ('1  jMi;;o  iti  the  pit'.eiuo  ol 
      ... amplitude modulation [reading uncertain]     6                       -5.0,  8.0

 4    To test the effect of the presence of an         2           3           -8.0,  5.0
      amplitude modulated channel of noise

 5    To test the effect of changing the               [illegible]             -2.0, 11.0
      detectability of a [illegible]

 6    To determine the effect of actual modulation     [illegible] -           -5.0,  8.0
      waveforms applied to a band of noise

 7    To determine the effect of stripping the         11.11       [illegible] -9.0,  4.0
      modulation from a recorded signal

 8    To evaluate the performance for recorded         13.12       [illegible] -2.0, 11.0
      signals with a missing band of frequencies

 9    To determine how the criterion differs           9.12        15.12       -2.0, 11.0
      between University students and sonar
      operators

10    To test for the difference in SNR between        6.11        15.11       -3.0, 10.0
      University students and sonar operators,
      high pass case

11    To test for the difference in SNR between        9.10        [illegible] -4.0,  9.0
      University students and sonar operators,
      high pass case

12    To test for the difference in SNR between        11.10       10.10       -9.0,  4.0
      University students and sonar operators,
      shaped noise case

13    To determine the effect of an extended           2           6           -2.0, 11.0
      shadowing period on performance

14    To test the effect of the presence of a          2           7           -5.0,  8.0
      band of noise centered at [?]50 Hz

15    To determine the effect of an extended           9           8.11        -5.0,  8.0
      shadowing period on performance

16    To evaluate effect on SNR of changing the        14.11       15.11       -2.0, 11.0
      high pass cutoff frequency

17    To evaluate effect on SNR of changing the        14.12       15.12        0.0, 13.0
      high pass cutoff frequency

Features which use recorded marine sounds are identified
by the corresponding signal identification.


Table A.3. Summary of Treatments Using the Modified Threshold
Procedure Scheduled for Sonar School Tests


(Remaining column of Table A.3; heading illegible. Entries in order:
7?, 72, 72, 72, 60, 82, 72, 7?, 72, [illegible], [illegible], 12, 12, 16, 12, 24, 24)

APPENDIX B
HARDWARE SYSTEMS

The tapes used as the source of the auditory stimuli in these tests
are the end product of a carefully controlled series of taping steps. The
marine sources of interest come recorded on a large variety of machines,
at various tape speeds and with various recording techniques. Selected portions
of these raw source tapes are re-recorded onto 1/2 inch magnetic tapes in FM
recording mode and at 60 or 30 inches per second. During the re-recording of
these master tapes, adjustment is made for hydrophone or other frequency
weighting appearing on the raw data. The signals are also prewhitened by an
amplifier whose gain increases with frequency at 6 dB per octave.
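
The 6 dB per octave prewhitening characteristic can be illustrated
numerically. The following is a minimal sketch, not the amplifier used for
these tapes: it assumes a digital first-difference filter as a stand-in for
the analog stage, and the sample rate and test frequencies are illustrative
values that do not come from this report.

```python
import math

def first_difference_gain_db(freq_hz, sample_rate_hz):
    """Gain in dB of the filter y[n] = x[n] - x[n-1] at freq_hz.

    Its magnitude response is |H(f)| = 2*sin(pi*f/fs), which rises at
    close to 6 dB per octave for f well below fs/2.
    """
    magnitude = 2.0 * math.sin(math.pi * freq_hz / sample_rate_hz)
    return 20.0 * math.log10(magnitude)

def slope_db_per_octave(freq_hz, sample_rate_hz):
    """Gain increase when the frequency is doubled from freq_hz."""
    return (first_difference_gain_db(2.0 * freq_hz, sample_rate_hz)
            - first_difference_gain_db(freq_hz, sample_rate_hz))

# Illustrative values only (assumptions, not taken from the report).
SAMPLE_RATE_HZ = 48000.0
for f in (50.0, 100.0, 200.0, 400.0):
    print(f, round(slope_db_per_octave(f, SAMPLE_RATE_HZ), 2))
```

At frequencies far below half the sample rate the printed slope stays close
to 20*log10(2), or about 6.02 dB per octave, which is the behavior the
prewhitening stage is described as having.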

The master tapes serve as inputs to any stimuli recordings which use
actual recorded marine sounds. Alternately, a weighted noise source may be
substituted for either member of the exposure set. Recorded ocean ambient
or shaped noise serves as the background noise against which the probe signal
is presented. These various signals are properly time multiplexed by means
of an analog selector which receives control signals from a digital sequencer.
Figure A.1 shows the various components of the system needed to create primary
tapes from master tapes. The flow of analog and digital signals is also
shown. In Figure A.1, those components which are not specifically indicated
as being commercial items were fabricated at the Applied Research Laboratory.
Most of these components were designed and built by the author. The major
portions of the system, with the exception of the analog input tape unit, are
pictured in Figure A.2. Necessary interconnecting cabling has been omitted
from this photograph, however. Figure A.3 shows the sequencer/controller in
greater detail.

The digital sequencer and associated controller orchestrate the required
functions in response to a simple program stored in a read-only memory. The
sequencer starts and stops the output tape drive, sequences the various audio
signals, and causes the generation of control signals which will be used at
the test site to indicate response periods and to record responses. The exact
order of the process is determined by a program selector and by the state of
sense switches. The setting of the balanced mixer is performed manually in
response to a cue light which indicates the response period portion of the
audio sequence.
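
The sequencing described above can be sketched in software. This is a
hypothetical model only: the step names, durations, source labels, and the
single sense switch are invented for illustration and do not reproduce the
program held in the read-only memory.

```python
from typing import NamedTuple, Tuple

class Step(NamedTuple):
    source: str            # which analog input the selector passes
    seconds: float         # how long the selector holds that input
    response_period: bool  # True while the cue light / control tone is on

# Hypothetical "ROM" contents: one exposure-and-response cycle.
PROGRAM: Tuple[Step, ...] = (
    Step("exposure_a", 4.0, False),
    Step("exposure_b", 4.0, False),
    Step("probe_in_noise", 6.0, True),
)

def run_sequencer(program, skip_second_exposure=False):
    """Return (start_time, source, response_period) events in order.

    skip_second_exposure stands in for a sense switch that alters the
    order of the process; it is an assumption made for this sketch.
    """
    clock = 0.0
    events = [(clock, "tape_drive_start", False)]
    for step in program:
        if skip_second_exposure and step.source == "exposure_b":
            continue
        events.append((clock, step.source, step.response_period))
        clock += step.seconds
    events.append((clock, "tape_drive_stop", False))
    return events

for event in run_sequencer(PROGRAM):
    print(event)
```

Holding the program in a fixed table and letting a switch alter the path
through it mirrors the division of labor in the text between the stored
program, the program selector, and the sense switches.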

A direct recorded monitor channel of the output tape is used for voice
annotation of cut numbers on the primary tapes and to record a 12.5 kHz pilot
tone. This tone is on during valid portions of data on an FM recorded audio
channel and an FM recorded control channel. This control channel carries tones
for control of the response recorder and a VCO output proportional to the
balanced mixer setting.

Cuts from the primary tapes are re-recorded onto the final audio tapes
in a randomized order. The monitor channel of the primary tape is used both
to locate the desired cuts and to allow for voice annotation between events
on the audio tapes. This voice annotation of event numbers and of special
instructions to the listener is usually from a cassette recording. A phase
locked loop and analog selector combination ensures that the objectionable
FM discriminator noise output while searching the primary tape does not appear
on the tapes used for listener tests. Figure A.4 diagrams this aspect of
the recording process.
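
The muting of discriminator noise can be sketched digitally. The sketch
below is an assumption-laden stand-in, not the circuit of the figure: it
assumes the gate is keyed by the presence of the 12.5 kHz pilot tone on the
monitor channel, and it substitutes a Goertzel tone detector for the phase
locked loop. The sample rate, block length, and threshold are illustrative
values.

```python
import math

def goertzel_power(block, tone_hz, sample_rate_hz):
    """Power of `block` at tone_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * tone_hz / sample_rate_hz)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in block:
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def gate_audio(audio, monitor, tone_hz=12500.0, sample_rate_hz=50000.0,
               block_len=200, threshold=0.1):
    """Pass audio blocks only while the pilot tone is detected on monitor.

    All default values are hypothetical. Blocks without a detected tone
    are replaced by silence, playing the role of the analog selector.
    """
    out = []
    for start in range(0, len(audio), block_len):
        mon_block = monitor[start:start + block_len]
        n = len(mon_block)
        # Normalized so a unit-amplitude tone gives a level near 0.25.
        level = goertzel_power(mon_block, tone_hz, sample_rate_hz) / (n * n) if n else 0.0
        block = audio[start:start + block_len]
        out.extend(block if level > threshold else [0.0] * len(block))
    return out
```

Calling gate_audio with a monitor signal that carries the tone for only part
of its length returns the audio unchanged over that part and silence
elsewhere.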